A certified binary is a value together with a proof that the value satisfies a given specification.
Existing compilers that generate certified code have focused on simple memory and control-flow
safety rather than more advanced properties. In this paper, we present a general framework for
explicitly representing complex propositions and proofs in typed intermediate and assembly
languages. The new framework allows us to reason about certified programs that involve effects
while still maintaining decidable typechecking. We show how to integrate an entire proof
system (the calculus of inductive constructions) into a compiler intermediate language and how the
intermediate language can undergo complex transformations (CPS and closure conversion) while
preserving proofs represented in the type system. Our work provides a foundation for the process
of automatically generating certified binaries in a type-theoretic framework.

1. INTRODUCTION

Proof-carrying  code  (PCC),  as  pioneered  by  Necula  and  Lee  [1996]  [Necula 
1997],  allows  a  code  producer  to  provide  a  machine-language  program  to  a 
host,  along  with  a  formal  proof  of  its  safety.  The  proof  can  be  mechanically 
checked  by  the  host;  the  producer  need  not  be  trusted  because  a  valid  proof  is 
incontrovertible  evidence  of  safety. 

The  PCC  framework  is  general  because  it  can  be  applied  to  certify  arbitrary 
data  objects  with  complex  specifications  [Necula  1998;  Appel  and  Felten  2001]. 
For  example,  the  Foundational  PCC  system  [Appel  and  Felty  2000]  can  certify 
any  property  expressible  in  Church’s  higher-order  logic.  Harper  [2000]  and 
Burstall  and  McKinna  [1991]  call  all  these  proof-carrying  constructs  certified 
binaries  (or  deliverables).  A  certified  binary  is  a  value  (which  can  be  a  function, 
a  data  structure,  or  a  combination  of  both)  together  with  a  proof  that  the  value 
satisfies  a  given  specification. 

Unfortunately, little is known about how to construct or generate certified
binaries. Most existing certifying compilers [Necula and Lee 1998; Colby et al.
2000] have focused on simple memory and control-flow safety only. Typed
intermediate languages [Harper and Morrisett 1995] and typed assembly
languages [Morrisett et al. 1998] are effective techniques for automatically
generating certified code; however, none of these type systems can rival the
expressiveness of the actual higher-order predicate logic (which could be used in any
Foundational PCC system).

In this paper, we present a type-theoretic framework for constructing,
composing, and reasoning about certified binaries. Our plan is to use the
formulae-as-types principle [Howard 1980] to represent propositions and proofs in a
general type system, and then to investigate their relationship with compiler
intermediate and assembly languages. We show how to integrate an entire
proof system (the calculus of inductive constructions [Paulin-Mohring 1993;
Coquand and Huet 1988]) into an intermediate language, and how to define
complex transformations (CPS and closure conversion) of programs in this
language so that they preserve proofs represented in the type system. Our paper
builds upon a large body of previous work in the logic and theorem-proving
community (see [Barendregt and Geuvers 1999; Barendregt 1991] for a good
summary), and makes the following new contributions:

— We show how to design new typed intermediate languages that are capable
of representing and manipulating propositions and proofs. In particular, we
show how to maintain decidability of typechecking when reasoning about
certified programs that involve effects. This is different from the work done
in the logic community, which focuses on strongly normalizing (primitive
recursive) programs.

— We maintain a phase distinction between compile-time typechecking and
run-time evaluation. This property is often lost in the presence of
dependent types (which are necessary for representing proofs in predicate logic).
We achieve this by never having the type language (see Section 3) depend
on the computation language (see Section 4). Proofs are instead always
represented at the type level using dependent kinds.

ACM Transactions on Programming Languages and Systems, Vol. TBD, No. TDB, Month Year.

A Type System for Certified Binaries

— We  show  how  to  use  propositions  to  express  program  invariants  and  how  to 
use  proofs  to  serve  as  static  capabilities.  Following  Xi  and  Pfenning  [1999], 
we  use  singleton  types  [Hayashi  1991]  to  support  the  necessary  interaction 
between  the  type  and  computation  languages.  We  can  assign  an  accurate 
type  to  unchecked  vector  (or  array)  access  (see  Section  4.3).  Xi  and  Pfenning 
[1999]  can  achieve  the  same  using  constraint  checking,  but  their  system  does 
not  support  arbitrary  propositions  and  (explicit)  proofs,  so  it  is  less  general 
than  ours. 

— We use a single type language to typecheck different compiler intermediate
languages. This is crucial because it is impractical to have separate proof
libraries for each intermediate language. We achieve this by using inductive
definitions to define all types used to classify computation terms. This in
turn nicely fits our work on (fully reflexive) intensional type analysis
[Trifonov et al. 2000] into a single system.

— We show how to perform CPS and closure conversion on our intermediate
languages while still preserving proofs represented in the type system.
Existing algorithms [Morrisett et al. 1998; Harper and Lillibridge 1993;
Minamide et al. 1996; Barthe et al. 1999] all require that the transformation
be performed on the entire type language. This is impractical because proofs
are large in size; transforming them can alter their meanings and break the
sharing among different languages. We present new techniques that
completely solve these problems (Sections 5-6).

— Our type language is a variant of the calculus of inductive constructions of
Paulin-Mohring [1993] and Coquand and Huet [1988]. Following Werner
[1994], we give rigorous proofs for its meta-theoretic properties (subject
reduction, strong normalization, confluence, and consistency of the
underlying logic). We also give the soundness proof for our sample computation
language. See Sections 3-4, the appendix, and the companion technical
report [Shao et al. 2001] for details.

As far as we know, our work is the first comprehensive study on how to
incorporate higher-order predicate logic (with inductive terms and predicates) into
typed intermediate languages. Our results are significant because they open
up many new exciting possibilities in the area of type-based language design
and compilation. The fact that we can internalize a very expressive logic into
our type system means that formal reasoning traditionally done at the meta
level can now be expressed inside the actual language itself. For example,
much of the past work on program verification using Hoare-like logics may
now be captured and made explicit in a typed intermediate language.

From the standpoint of type-based language design, recent work [Harper
and Morrisett 1995; Xi and Pfenning 1999; Crary et al. 1999; Walker 2000;
Crary and Weirich 2000; Trifonov et al. 2000] has produced many specialized,
increasingly complex type systems, each with its own meta-theoretical proofs,
yet it is unclear how they will fit together. We can hope to replace them with
one very general type system whose meta theory is proved once and for all, and
that allows the definition of specialized type operators via the general
mechanism of inductive definitions. For example, inductive definitions subsume and
generalize earlier systems for intensional type analysis [Harper and Morrisett
1995; Crary and Weirich 1999; Trifonov et al. 2000].

We have a prototype implementation of our new type system in the FLINT
compiler [Shao 1997; Shao et al. 1998], but making the implementation
realistic still involves solving many remaining problems (e.g., efficient proof
representations). Nevertheless, we believe our current contributions constitute a
significant step toward the goal of providing a practical end-to-end compiler
that generates certified binaries.

2.  APPROACH 

Our main objectives are to design typed intermediate and low-level languages
that can directly manipulate propositions and proofs, and then to use them to
certify realistic programs. We want our type system to be simple but general;
we also want to support complex transformations (CPS and closure
conversion) that preserve proofs represented in the type system. In this section, we
describe the main challenges involved in achieving these goals and give a
high-level overview of our main techniques.

Before diving into the details, we first establish a few naming conventions
that we will use in the rest of this paper. Typed intermediate languages are
usually structured in the same way as typed $\lambda$-calculi. Figure 1 gives a
fragment of a richly typed $\lambda$-calculus, organized into four levels: kind schema
(kscm) $u$, kind $\kappa$, type $\tau$, and expression (exp) $e$. If we ignore kind schemas and
other extensions, this is just the higher-order polymorphic $\lambda$-calculus $F_\omega$
[Girard 1972].

We divide each typed intermediate language into a type sub-language and
a computation sub-language. The type language contains the top three levels.
Kind schemas classify kind terms while kinds classify type terms. We often
say that a kind term $\kappa$ has kind schema $u$, or a type term $\tau$ has kind $\kappa$. We
assume all kinds used to classify type terms have kind schema Kind, and all
types used to classify expressions have kind $\Omega$. Both the function type $\tau_1 \to \tau_2$
and the polymorphic type $\forall t:\kappa.\,\tau$ have kind $\Omega$. Following the tradition, we
sometimes say “a kind $\kappa$” to imply that $\kappa$ has kind schema Kind, “a type $\tau$” to
imply that $\tau$ has kind $\Omega$, and “a type constructor $\tau$” to imply that $\tau$ has kind
“$\kappa \to \cdots \to \Omega$.” Kind terms with other kind schemas, or type terms with other
kinds, are strictly referred to as “kind terms” or “type terms.”

The computation language contains just the lowest level, which is where we
write the actual program. This language will eventually be compiled into
machine code. We often use names such as computation terms, computation
values, and computation functions to refer to various constructs at this level.

2.1  Representing  propositions  and  proofs 

The type language:

(kscm) $u ::= \mathsf{Kind} \mid \ldots$

(kind) $\kappa ::= \kappa_1 \to \kappa_2 \mid \Omega \mid \ldots$

(type) $\tau ::= t \mid \lambda t:\kappa.\,\tau \mid \tau_1\,\tau_2 \mid \tau_1 \to \tau_2 \mid \forall t:\kappa.\,\tau \mid \ldots$

The computation language:

(exp) $e ::= x \mid \lambda x:\tau.\,e \mid e_1\,e_2 \mid \Lambda t:\kappa.\,e \mid e[\tau] \mid \ldots$

Fig. 1. Typed $\lambda$-calculi: a skeleton

The first step is to represent propositions and proofs for a particular logic in a
type-theoretic setting. The most established technique is to use the
formulae-as-types principle (a.k.a. the Curry-Howard correspondence) [Howard 1980] to
map propositions and proofs into a typed $\lambda$-calculus. The essential idea, which
is inspired by constructive logic, is to use types (of kind $\Omega$) to represent
propositions, and expressions to represent proofs. A proof of an implication $P \supset Q$ is
a function object that yields a proof of proposition $Q$ when applied to a proof of
proposition $P$. A proof of a conjunction $P \wedge Q$ is a pair $(e_1, e_2)$ such that $e_1$ is a
proof of $P$ and $e_2$ is a proof of $Q$. A proof of a disjunction $P \vee Q$ is a pair $(b, e)$,
that is, a tagged union, where $b$ is either 0 or 1: if $b = 0$, then $e$ is a proof of $P$; if $b = 1$,
then $e$ is a proof of $Q$. There is no proof for the false proposition. A proof of
a universally quantified proposition $\forall x \in B.\, P(x)$ is a function that maps every
element $b$ of the domain $B$ into a proof of $P(b)$, where $P$ is a unary predicate
on elements of $B$. Finally, a proof of an existentially quantified proposition
$\exists x \in B.\, P(x)$ is a pair $(b, e)$ where $b$ is an element of $B$ and $e$ is a proof of $P(b)$.
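
The correspondence described above can be tried out directly in a proof assistant. The following is a minimal Lean 4 sketch (no external libraries), offered only as an illustration: Lean's `Prop` plays the role of the propositions here, whereas the paper uses types of kind $\Omega$, and the names `Pred`, `pq`, and `allB` are hypothetical.

```lean
namespace FormulaeAsTypes

-- Implication: a proof of P → Q is a function from proofs of P to proofs of Q.
example (P Q : Prop) (pq : P → Q) (p : P) : Q := pq p

-- Conjunction: a proof of P ∧ Q is a pair of proofs.
example (P Q : Prop) (p : P) (q : Q) : P ∧ Q := ⟨p, q⟩

-- Disjunction: a proof of P ∨ Q is a tagged proof of one of the two sides.
example (P Q : Prop) (p : P) : P ∨ Q := Or.inl p
example (P Q : Prop) (q : Q) : P ∨ Q := Or.inr q

-- The false proposition has no proof; from a hypothetical one, anything follows.
example (P : Prop) (h : False) : P := False.elim h

-- Universal quantification: a function mapping each b : B to a proof of Pred b.
example (B : Type) (Pred : B → Prop) (allB : ∀ x : B, Pred x) (b : B) : Pred b :=
  allB b

-- Existential quantification: a pair of a witness b and a proof of Pred b.
example (B : Type) (Pred : B → Prop) (b : B) (h : Pred b) : ∃ x : B, Pred x :=
  ⟨b, h⟩

end FormulaeAsTypes
```

Checking each `example` is exactly typechecking the term on the right against the proposition on the left.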

Proof-checking in the logic now becomes typechecking in the corresponding
typed $\lambda$-calculus. There has been a large body of work done along this line in
the last 30 years; most type-based proof assistants are based on this
fundamental principle. Good surveys of the previous work in this area can be found
in Barendregt [1991] and Barendregt and Geuvers [1999].

2.2 Representing certified binaries

Under the type-theoretic setting, a certified binary $S$ is just a pair $(v, e)$ that
consists of:

— a value $v$ of type $\tau$ where $v$ could be a function, a data structure, or any
combination of both;

— and a proof $e$ of $P(v)$ where $P$ is a unary predicate on elements of type $\tau$.

Here $e$ is just an expression with type $P(v)$. The predicate $P$ is a dependent
type constructor with kind $\tau \to \Omega$. The entire package $S$ has a dependent
strong-sum type $\Sigma x:\tau.\, P(x)$.

For example, suppose Nat is the domain for natural numbers and Prime
is a unary predicate that asserts that an element of Nat is a prime number; we
introduce a type nat representing Nat, and a type constructor prime (of kind
$\mathsf{nat} \to \Omega$) representing Prime. We can build a certified prime-number package by
pairing a value $v$ (a natural number) with a proof for the proposition $\mathsf{prime}(v)$;
the resulting certified binary has type $\Sigma x:\mathsf{nat}.\, \mathsf{prime}(x)$.
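
A package of this shape can be sketched in Lean 4. To keep the proof term short, the sketch below swaps the predicate Prime for a toy inductive predicate `IsEven`; that substitution, and the names `CertEven` and `four`, are assumptions of this illustration and not part of the paper.

```lean
namespace CertifiedValue

-- Toy predicate standing in for Prime (an assumption of this sketch).
inductive IsEven : Nat → Prop where
  | zero : IsEven 0
  | addTwo (n : Nat) : IsEven n → IsEven (n + 2)

-- The strong sum Σ x : nat. IsEven x: a number paired with a proof about it.
abbrev CertEven : Type := Σ' x : Nat, IsEven x

-- The value 4 packaged with a proof that 4 is even.
def four : CertEven := ⟨4, IsEven.addTwo 2 (IsEven.addTwo 0 IsEven.zero)⟩

-- The first projection recovers the underlying value.
example : four.1 = 4 := rfl

end CertifiedValue
```
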

Function values can be certified in the same way. Given a function $f$ that
takes a natural number and returns another one as the result (i.e., $f$ has type
$\mathsf{nat} \to \mathsf{nat}$), in order to show that $f$ always maps a prime to another prime, we
need a proof for the following proposition:

$\forall x \in \mathit{Nat}.\ \mathit{Prime}(x) \supset \mathit{Prime}(f(x))$

In a typed setting, this universally quantified proposition is represented as a
dependent product type:

$\Pi x:\mathsf{nat}.\ \mathsf{prime}(x) \to \mathsf{prime}(f(x))$

The resulting certified binary has type

$\Sigma f:\mathsf{nat} \to \mathsf{nat}.\ \Pi x:\mathsf{nat}.\ \mathsf{prime}(x) \to \mathsf{prime}(f(x))$

Here the type is not only dependent on values but also on function applications
such as $f(x)$, so verifying the certified binary, which involves typechecking the
proof, in turn requires evaluating the underlying function application.
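
The same packaging for a function can be sketched in Lean 4. As a simplifying assumption, the toy predicate `IsEven` again stands in for Prime, and the hypothetical function `fun x => x + 2` plays the role of $f$; the checker accepts the proof only because it can unfold the application of the function inside the type, which is the dependency the text points out.

```lean
namespace CertifiedFunction

-- Toy predicate standing in for Prime (an assumption of this sketch).
inductive IsEven : Nat → Prop where
  | zero : IsEven 0
  | addTwo (n : Nat) : IsEven n → IsEven (n + 2)

-- Σ f : nat → nat. Π x : nat. IsEven x → IsEven (f x)
abbrev CertEvenFn : Type :=
  Σ' f : Nat → Nat, ∀ x : Nat, IsEven x → IsEven (f x)

-- The function together with a proof that it preserves the predicate.
-- Typechecking the proof requires reducing (fun x => x + 2) x to x + 2.
def plusTwo : CertEvenFn :=
  ⟨fun x => x + 2, fun x h => IsEven.addTwo x h⟩

-- The first projection is the ordinary function.
example : plusTwo.1 4 = 6 := rfl

end CertifiedFunction
```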

2.3  The  problems  with  dependent  types 

The above scheme unfortunately fails to work in the context of typed
intermediate (or assembly) languages. There are at least four problems with
dependent types; the third and fourth are present even in the general context.

First, real programs often involve effects such as assignment, I/O, or
nontermination. Effects interact badly with dependent types. In our previous
example, suppose the function $f$ does not terminate on certain inputs; then
clearly, typechecking (which could involve applying $f$) would become
undecidable. It is possible to use the effect discipline [Sheldon and Gifford 1990] to
force types to be dependent on pure computation only, but this does not work in
some typed $\lambda$-calculi; for example, a “pure” term in Girard’s $\lambda U$ [Girard 1972]
could still diverge.

Second, even if applying $f$ does not involve any effects, we still have more
serious problems. In a type-preserving compiler, the body of the function $f$ has
to be compiled down to typed low-level languages. A few compilers perform
typed CPS conversion [Morrisett et al. 1998], but in the presence of dependent
types, this is a very difficult problem [Barthe et al. 1999]. Also,
typechecking in low-level languages would now require performing the equivalent of
$\beta$-reductions on the low-level (assembly) code; this is awkward and difficult to
support cleanly.

Third,  it  is  important  to  maintain  a  phase  distinction  between  compile-time 
typechecking  and  run-time  evaluation.  But  having  dependent  strong-sum  and 
product  types  makes  it  harder  to  preserve  this  property,  especially  if  the  type- 
dependent  values  are  first-class  citizens  (certified  binaries  are  used  to  validate 
arbitrary  data  structures  and  program  functions  so  they  should  be  allowed  to 
be  passed  as  arguments,  returned  as  results,  or  stored  in  memory). 

Finally, supporting subset types in the presence of dependent strong-sum
and product types is difficult if not impossible [Constable 1985; Nordstrom
et al. 1990]. A certified binary of type $\Sigma x:\mathsf{nat}.\, \mathsf{prime}(x)$ contains a natural
number $v$ and a proof that $v$ is a prime. However, in many cases, we just want
$v$ to belong to a subset type $\{x:\mathsf{nat} \mid \mathsf{prime}(x)\}$, i.e., $v$ is a prime number but
the proof of this is not packaged together with $v$; instead, it can be constructed from the
current context.

2.4  Separating  the  type  and  computation  languages 

We solve these problems by making sure that our type language is never
dependent on the computation language. Because the actual computation term has
to be compiled down to assembly code in any case, it is a bad idea to treat it as
part of types. This separation immediately gives us back the phase-distinction
property.

To  represent  propositions  and  proofs,  we  lift  everything  one  level  up:  we  use 
kinds  to  represent  propositions,  and  type  terms  for  proofs.  The  domain  Nat  is 
represented  by  a  kind  Nat;  the  predicate  Prime  is  represented  by  a  dependent 
kind  term  Prime  which  maps  a  type  term  of  kind  Nat  to  a  proposition.  A  proof 
for  proposition  Prime(n)  certifies  that  the  type  term  n  is  a  prime  number. 

To maintain decidable typechecking, we insist that the type language is
strongly normalizing and free of side effects. This is possible because the type
language no longer depends on any runtime computation. Given a type-level
function $g$ of kind $\mathsf{Nat} \to \mathsf{Nat}$, we can certify that it always maps a prime to
another prime by building a proof $\tau_p$ for the following proposition, now
represented as a dependent product kind:

$\Pi t:\mathsf{Nat}.\ \mathsf{Prime}(t) \to \mathsf{Prime}(g(t)).$

Essentially, we circumvent the problems with dependent types by replacing
them with dependent kinds and by lifting everything (in the proof language)
one level up.

To reason about actual programs, we still have to connect terms in the type
language with those in the computation language. We follow Xi and Pfenning
[1999] and use singleton types [Hayashi 1991] to relate computation values to
type terms. In the previous example, we introduce a singleton type constructor
snat of kind $\mathsf{Nat} \to \Omega$. Given a type term $n$ of kind Nat, if a computation value $v$
has type $\mathsf{snat}(n)$, then $v$ denotes the natural number represented by $n$.

A certified binary for a prime number now contains three parts: a type term
$n$ of kind Nat, a proof for the proposition $\mathsf{Prime}(n)$, and a computation value
of type $\mathsf{snat}(n)$. We can pack it up into an existential package and make it a
first-class value with type:

$\exists n:\mathsf{Nat}.\ \exists t:\mathsf{Prime}(n).\ \mathsf{snat}(n).$

Here we use $\exists$ rather than $\Sigma$ to emphasize that types and kinds are no longer
dependent on computation terms. Under the erasure semantics [Crary et al.
1998], this certified binary is just an integer value of type $\mathsf{snat}(n)$ at run time.

Because there is a strong separation between types and computation terms,
a value $v$ of type $\exists n:\mathsf{Nat}.\ \exists t:\mathsf{Prime}(n).\ \mathsf{snat}(n)$ is still implemented as a single
integer at run time, thus achieving the effect of the subset type.
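
The three-part package can be sketched in Lean 4 as follows. This is only an approximation under stated assumptions: the toy predicate `IsEven` stands in for Prime, `SNat` is a hypothetical encoding of the singleton type, and Lean erases only the `Prop`-valued fields, so the index `n` is still stored at run time here, unlike under the erasure semantics of the paper.

```lean
namespace SingletonPackage

-- Toy predicate standing in for Prime (an assumption of this sketch).
inductive IsEven : Nat → Prop where
  | zero : IsEven 0
  | addTwo (n : Nat) : IsEven n → IsEven (n + 2)

-- snat(n): run-time values that denote exactly the index n.
structure SNat (n : Nat) where
  val : Nat
  ok : val = n

-- ∃ n : Nat. ∃ t : IsEven n. snat(n)
structure CertEven where
  n : Nat
  pf : IsEven n
  v : SNat n

def four : CertEven where
  n := 4
  pf := IsEven.addTwo 2 (IsEven.addTwo 0 IsEven.zero)
  v := ⟨4, rfl⟩

-- The only computational content used by clients is the run-time number.
def CertEven.run (c : CertEven) : Nat := c.v.val

example : four.run = 4 := rfl

-- The package yields an element of the subset type { x : Nat // IsEven x }.
def CertEven.asSubset (c : CertEven) : { x : Nat // IsEven x } :=
  ⟨c.v.val, by rw [c.v.ok]; exact c.pf⟩

end SingletonPackage
```
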

We can also build certified binaries for programs that involve effects.
Returning to our example, assume again that $f$ is a function in the computation
language which may not terminate on some inputs. Suppose we want to certify
that if the input to $f$ is a prime, and the call to $f$ does return, then the result is
also a prime. We can achieve this in two steps. First, we construct a type-level
function $g$ of kind $\mathsf{Nat} \to \mathsf{Nat}$ to simulate the behavior of $f$ (on all inputs where
$f$ does terminate) and show that $f$ has the following type:

$\forall n:\mathsf{Nat}.\ \mathsf{snat}(n) \to \mathsf{snat}(g(n))$

Here, following Figure 1, we use $\forall$ and $\to$ to denote the polymorphic and
function types for the computation language. The type for $f$ says that if it takes an
integer of type $\mathsf{snat}(n)$ as input and does return, then it will return an integer
of type $\mathsf{snat}(g(n))$. Second, we construct a proof $\tau_p$ showing that $g$ always maps
a prime to another prime. The certified binary for $f$ now also contains three
parts: the type-level function $g$, the proof $\tau_p$, and the computation function $f$
itself. We can pack it into an existential package with type:

$\exists g:\mathsf{Nat} \to \mathsf{Nat}.\ \exists p:(\Pi t:\mathsf{Nat}.\ \mathsf{Prime}(t) \to \mathsf{Prime}(g(t))).$

$\forall n:\mathsf{Nat}.\ \mathsf{snat}(n) \to \mathsf{snat}(g(n))$

Notice this type also contains function applications such as $g(n)$, but $g$ is a
type-level function which is always strongly normalizing, so typechecking is
still decidable.
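
The shape of this package can be sketched in Lean 4. Several things here are assumptions of the sketch rather than the paper's construction: `IsEven` stands in for Prime, `SNat` is a hypothetical singleton encoding, and the run-time function `plusTwoRt` is total, because Lean requires termination, so the effectful case is not modeled.

```lean
namespace CertifiedComputation

-- Toy predicate standing in for Prime (an assumption of this sketch).
inductive IsEven : Nat → Prop where
  | zero : IsEven 0
  | addTwo (n : Nat) : IsEven n → IsEven (n + 2)

-- snat(n): run-time values that denote exactly the index n.
structure SNat (n : Nat) where
  val : Nat
  ok : val = n

-- ∃ g : Nat → Nat. ∃ p : (Π t : Nat. IsEven t → IsEven (g t)).
--   ∀ n : Nat. snat(n) → snat(g n)
structure CertFun where
  g : Nat → Nat
  p : ∀ t : Nat, IsEven t → IsEven (g t)
  f : (n : Nat) → SNat n → SNat (g n)

-- The run-time function computes only with the run-time value v.val.
def plusTwoRt (n : Nat) (v : SNat n) : SNat (n + 2) :=
  ⟨v.val + 2, by rw [v.ok]⟩

-- The specification g, the proof about g, and the program that follows g.
def certPlusTwo : CertFun where
  g := fun t => t + 2
  p := fun t h => IsEven.addTwo t h
  f := plusTwoRt

example : (certPlusTwo.f 4 ⟨4, rfl⟩).val = 6 := rfl

end CertifiedComputation
```

Note that the proof field `p` mentions only the specification `g`, never the program `plusTwoRt`; this mirrors the separation between the type and computation languages.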

It is important to understand the difference between typechecking and “type
inference.” The main objective of this paper is to develop a fully explicit
framework where proofs and assertions can be used to certify programs that may
contain side effects; the most important property is that typechecking (and
proof-checking) in the new framework must be decidable. Type inference (i.e.,
finding the proofs), on the other hand, could be undecidable: given an
arbitrarily complex function $f$, we clearly cannot hope to automatically construct the
corresponding $g$. In practice, however, it is often possible to first write down
the specification $g$ and then to write the corresponding program $f$. Carrying
out this step and constructing the proof that $f$ follows $g$ is a challenging task,
as in any other PCC system [Necula 1998; Appel and Felty 2000].

2.5  Designing  the  type  language 

We can incorporate propositions and proofs into typed intermediate languages,
but designing the actual type language is still a challenge. For decidable
typechecking, the type language should not depend on the computation language
and it must satisfy the usual meta-theoretical properties (e.g., strong
normalization).

But the type language also has to fulfill its usual responsibilities. First, it
must provide a set of types (of kind $\Omega$) to classify the computation terms. A
typical compiler intermediate language supports a large number of basic type
constructors (e.g., integer, array, record, tagged union, and function). These
types may change their forms during compilation, so different intermediate
languages may have different definitions of $\Omega$; for example, a computation
function at the source level may be turned into CPS-style, or later, to one whose
arguments are machine registers [Morrisett et al. 1998]. We also want to
support intensional type analysis [Harper and Morrisett 1995] which is crucial for
typechecking runtime services [Monnier et al. 2001].

Our solution is to provide a general mechanism of inductive definitions in
our type language and to define each such $\Omega$ as an inductive kind. This was
made possible only recently [Trifonov et al. 2000] and it relies on the use of
polymorphic kinds. Taking the type language in Figure 1 as an example, we
add kind variables $k$ and polymorphic kinds $\Pi k:u.\,\kappa$, and replace $\Omega$ and its
associated type constructors with inductive definitions (not shown):

(kscm) $u ::= \mathsf{Kind} \mid \ldots$

(kind) $\kappa ::= \kappa_1 \to \kappa_2 \mid k \mid \Pi k:u.\,\kappa \mid \ldots$

(type) $\tau ::= t \mid \lambda t:\kappa.\,\tau \mid \tau_1\,\tau_2 \mid \lambda k:u.\,\tau \mid \tau[\kappa] \mid \ldots$

At the type level, we add kind abstraction $\lambda k:u.\,\tau$ and kind application $\tau[\kappa]$.
The kind $\Omega$ is now inductively defined as follows (see Sections 3-4 for more
details):

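$\mathsf{Inductive}\ \Omega : \mathsf{Kind} := {\twoheadrightarrow} : \Omega \to \Omega \to \Omega$

$\qquad \mid\ \forall : \Pi k:\mathsf{Kind}.\ (k \to \Omega) \to \Omega$

Here $\twoheadrightarrow$ and $\forall$ are two of the constructors (of $\Omega$). The polymorphic type $\forall t:\kappa.\,\tau$
is now written as $\forall[\kappa]\,(\lambda t:\kappa.\,\tau)$; the function type $\tau_1 \to \tau_2$ is just ${\twoheadrightarrow}\ \tau_1\,\tau_2$.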
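
An inductive definition of this shape can be sketched in Lean 4. The sketch makes two assumptions that are not taken from the text above: an extra constructor `snat` is added so that the example has a base type to work with, and Lean's predicative universes mean that `all` can quantify over small types such as `Nat` but not over `Omega` itself.

```lean
namespace InductiveOmega

-- Ω as an ordinary inductive definition with constructors for the
-- function type and the polymorphic type.
inductive Omega : Type 1 where
  | snat : Nat → Omega                         -- assumed base constructor
  | arrow : Omega → Omega → Omega              -- function type
  | all : (k : Type) → (k → Omega) → Omega     -- polymorphic type

-- ∀ n : Nat. snat(n) → snat(n + 2), written with the constructors as
-- all[Nat] (λ n. arrow (snat n) (snat (n + 2))).
def plusTwoTy : Omega :=
  Omega.all Nat (fun n => Omega.arrow (Omega.snat n) (Omega.snat (n + 2)))

end InductiveOmega
```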

Inductive definitions also greatly increase the programming power of our
type language. We can introduce new data objects (e.g., integers, lists) and
define primitive recursive functions, all at the type level; these in turn are
used to help model the behaviors of the computation terms.

To have the type language double up as a proof language for higher-order
predicate logic, we add the dependent product kind $\Pi t:\kappa_1.\,\kappa_2$, which subsumes the
arrow kind $\kappa_1 \to \kappa_2$; we also add kind-level functions to represent predicates.
Thus the type language naturally becomes the calculus of inductive
constructions [Paulin-Mohring 1993].

2.6  Proof-preserving  compilation 

Even  with  a  proof  system  integrated  into  our  intermediate  languages,  we  still 
have  to  make  sure  that  they  can  be  CPS-  and  closure-converted  down  to  low- 
level  languages.  These  transformations  should  preserve  proofs  represented  in 
the  type  system;  in  fact,  they  should  not  traverse  the  proofs  at  all  since  doing 
so  is  impractical  with  large  proof  libraries. 

These challenges are nontrivial but the way we set up our type system makes
it easier to solve them. First, because our type language does not depend on
the computation language, we do not have the difficulties involved in
CPS-converting dependently typed $\lambda$-calculi [Barthe et al. 1999]. Second, all our
intermediate languages share the same type language, thus also the same proof
library; this is possible because the $\Omega$ kind (and the associated types) for each
intermediate language is just a regular inductive definition.

Finally, a type-preserving program transformation often requires
translating the source types (of the source $\Omega$ kind) into the target types (of the target $\Omega$
kind). Existing CPS- and closure-conversion algorithms [Morrisett et al. 1998;
Harper and Lillibridge 1993; Minamide et al. 1996] all perform this translation
at the meta-level; they have to go through every type term (thus every proof
term in our setting) during the translation, because any type term may
contain a sub-term which has the source $\Omega$ kind. In our framework, the fact that
each $\Omega$ kind is inductively defined means that we can internalize and write the
type-translation function inside our type language itself. This leads to elegant
algorithms that do not traverse any proof terms but still preserve typing and
proofs (see Sections 5-6 for details).
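
The idea of writing the type translation as a function over an inductively defined kind can be sketched in Lean 4. Everything below is a simplified illustration under stated assumptions and not the algorithm of Sections 5-6: `SrcTy` and `TgtTy` are hypothetical stand-ins for a source and a target $\Omega$ kind, and `cont1` and `cont2` are hypothetical constructors for non-returning code taking one or two arguments.

```lean
namespace TypeTranslation

-- Hypothetical source kind: direct-style function types.
inductive SrcTy : Type 1 where
  | snat : Nat → SrcTy
  | arrow : SrcTy → SrcTy → SrcTy
  | all : (k : Type) → (k → SrcTy) → SrcTy

-- Hypothetical target kind: functions never return, they call continuations.
inductive TgtTy : Type 1 where
  | snat : Nat → TgtTy
  | cont1 : TgtTy → TgtTy             -- continuation taking one argument
  | cont2 : TgtTy → TgtTy → TgtTy     -- code taking an argument and a continuation
  | all : (k : Type) → (k → TgtTy) → TgtTy

-- The translation recurses only over the constructors of SrcTy. Index terms
-- such as `n` in `snat n`, and bound variables `t`, are passed through untouched.
def cps : SrcTy → TgtTy
  | .snat n => .snat n
  | .arrow a b => .cont2 (cps a) (.cont1 (cps b))
  | .all k body => .all k (fun t => cps (body t))

example :
    cps (SrcTy.arrow (SrcTy.snat 1) (SrcTy.snat 2))
      = TgtTy.cont2 (TgtTy.snat 1) (TgtTy.cont1 (TgtTy.snat 2)) := rfl

end TypeTranslation
```

In the `all` case the translation goes under the binder without inspecting the quantified term, which is the sense in which such a function does not need to traverse proofs.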

2.7  Putting  it  all  together 

A certifying compiler in our framework will have a series of intermediate
languages, each corresponding to a particular stage in the compilation process;
all will share the same type language. An intermediate language is now just
the type language plus the corresponding computation terms, along with the
inductive definition for the corresponding $\Omega$ kind. In the rest of this paper, we
first give a formal definition of our type language (which will be named TL
from now on) in Section 3; we then present a sample computation language $\lambda_H$
in Section 4; we show how $\lambda_H$ can be CPS- and closure-converted into low-level
languages in Sections 5-6; finally, we discuss related work and then conclude.

3.  THE  TYPE  LANGUAGE  TL 

Our type language TL resembles the calculus of inductive constructions (CIC)
implemented in the Coq proof assistant [Huet et al. 2000]. This is a great
advantage because Coq is a very mature system and it has a large set of proof
libraries which we can potentially reuse. For this paper, we decided not to
directly use CIC as our type language for three reasons. First, CIC contains
some features designed for program extraction [Paulin-Mohring 1989] which
are not required in our case (where proofs are only used as specifications for
the computation terms). Second, as far as we know, there are still no formal
studies covering the entire CIC language. Third, for theoretical purposes, we
want to understand which features are most essential for modeling certified
binaries. In practice these differences are fairly minor. The main objective of
this section is to give a quick introduction to the essential features of the
Coq-like dependent type theory.

3.1  Motivations 

Following  the  discussion  in  Section  2.5,  we  organize  TL  into  the  following  three 
levels: 

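\[
\begin{array}{llcl}
(\textit{kscm}) & u & ::= & z \mid \Pi t{:}\kappa.\,u \mid \Pi k{:}u.\,u' \mid \mathsf{Kind} \\
(\textit{kind}) & \kappa & ::= & k \mid \lambda t{:}\kappa.\,\kappa' \mid \kappa[\tau] \mid \lambda k{:}u.\,\kappa \mid \kappa\,\kappa' \mid \Pi t{:}\kappa.\,\kappa' \mid \Pi k{:}u.\,\kappa \\
 & & \mid & \Pi z{:}\mathsf{Kscm}.\,\kappa \mid \mathsf{Ind}(k{:}\mathsf{Kind})\{\vec{\kappa}\} \mid \mathsf{Elim}[\kappa', u](\tau)\{\vec{\kappa}\} \\
(\textit{type}) & \tau & ::= & t \mid \lambda t{:}\kappa.\,\tau \mid \tau\,\tau' \mid \lambda k{:}u.\,\tau \mid \tau[\kappa] \mid \lambda z{:}\mathsf{Kscm}.\,\tau \mid \tau[u] \\
 & & \mid & \mathsf{Ctor}(i, \kappa) \mid \mathsf{Elim}[\kappa', \kappa](\tau')\{\vec{\tau}\}
\end{array}
\]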
Here kind schemas (kscm) classify kind terms while kinds classify type terms.
There are variables at all three levels: kind-schema variables $z$, kind
variables $k$, and type variables $t$. We have an external constant Kscm
classifying all the kind schemas; essentially, TL has an additional level above
kscm, of which Kscm is the sole member.

A good way to comprehend TL is to look at its five $\Pi$ constructs: there are
three at the kind level and two at the kind-schema level. We use a few examples
to explain why each of them is necessary. Following tradition, we use arrow
terms (e.g., $\kappa_1 \to \kappa_2$) as syntactic sugar for the non-dependent
$\Pi$ terms (e.g., $\Pi t{:}\kappa_1.\,\kappa_2$ is non-dependent if $t$ does
not occur free in $\kappa_2$).

— Kinds $\Pi t{:}\kappa.\,\kappa'$ and $\kappa \to \kappa'$ are used to typecheck
the type-level function $\lambda t{:}\kappa.\,\tau$ and the corresponding
application form $\tau_1\,\tau_2$. Assuming $\Omega$ and Nat are inductive kinds
(defined later) and Prime is a predicate with kind schema
$\mathsf{Nat} \to \mathsf{Kind}$, we can write a type term such as
$\lambda t{:}\Omega.\,t$ which has kind $\Omega \to \Omega$, a type-level
arithmetic function such as plus which has kind
$\mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Nat}$, or
the  universally  quantified  proposition  in  Section  2.2  which  is  represented  as 
the  kind  Ilf :  Nat.Prime(f)— >  Prime(c/(f)). 

— Kinds $\Pi k{:}u.\,\kappa$ and $u \to \kappa$ are used to typecheck the
type-level kind abstraction $\lambda k{:}u.\,\tau$ and its application form
$\tau[\kappa]$. As mentioned in Section 2.5, this is needed to support
intensional analysis of quantified types [Trifonov et al. 2000]. It can also be
used to define logic connectives and constants, as in

\[ \mathsf{True} : \mathsf{Kind} = \Pi k{:}\mathsf{Kind}.\ k \to k \]

\[ \mathsf{False} : \mathsf{Kind} = \Pi k{:}\mathsf{Kind}.\ k \]

True has the polymorphic identity as a proof:

\[ \mathsf{id} : \mathsf{True} = \lambda k{:}\mathsf{Kind}.\ \lambda t{:}k.\ t \]

but False is not inhabited (this is essentially the consistency property of TL
which we will show later).

— Kind $\Pi z{:}\mathsf{Kscm}.\,\kappa$ is used to typecheck the type-level
kind-schema abstraction $\lambda z{:}\mathsf{Kscm}.\,\tau$ and the corresponding
application $\tau[u]$. This is not in the core calculus of constructions
[Coquand and Huet 1988]. We use it in the inductive definition of $\Omega$ (see
Section 4) where both the $\forall_{\mathsf{Kscm}}$ and $\exists_{\mathsf{Kscm}}$
constructors have kind $\Pi z{:}\mathsf{Kscm}.\,(z \to \Omega) \to \Omega$.
These two constructors in turn allow us to typecheck predicate-polymorphic
computation terms, which occur fairly often since the closure-conversion phase
turns all functions with free predicate variables (e.g., Prime) into
predicate-polymorphic ones.

— Kind schemas $\Pi t{:}\kappa.\,u$ and $\kappa \to u$ are used to typecheck the
kind-level type abstraction $\lambda t{:}\kappa.\,\kappa'$ and the application
form $\kappa[\tau]$. The predicate Prime has kind schema
$\mathsf{Nat} \to \mathsf{Kind}$. A predicate with kind schema
$\Pi t{:}\mathsf{Nat}.\,\mathsf{Prime}(t) \to \mathsf{Kind}$ is only applicable
to prime numbers. We can also define for instance a binary relation:

\[ \mathsf{LT} : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Kind} \]

so that $\mathsf{LT}\ t_1\ t_2$ is a proposition asserting that the natural
number represented by $t_1$ is less than that of $t_2$.
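\[
\begin{array}{l}
\mathsf{Inductive}\ \mathsf{Nat} : \mathsf{Kind} := \mathsf{zero} : \mathsf{Nat} \mid \mathsf{succ} : \mathsf{Nat} \to \mathsf{Nat} \\[4pt]
\mathsf{plus} : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Nat} \\
\mathsf{plus}(\mathsf{zero}) = \lambda t{:}\mathsf{Nat}.\ t \\
\mathsf{plus}(\mathsf{succ}\ t) = \lambda t'{:}\mathsf{Nat}.\ \mathsf{succ}\ ((\mathsf{plus}\ t)\ t') \\[4pt]
\mathsf{ifez} : \mathsf{Nat} \to (\Pi k{:}\mathsf{Kind}.\ k \to (\mathsf{Nat} \to k) \to k) \\
\mathsf{ifez}(\mathsf{zero}) = \lambda k{:}\mathsf{Kind}.\ \lambda t_1{:}k.\ \lambda t_2{:}\mathsf{Nat} \to k.\ t_1 \\
\mathsf{ifez}(\mathsf{succ}\ t) = \lambda k{:}\mathsf{Kind}.\ \lambda t_1{:}k.\ \lambda t_2{:}\mathsf{Nat} \to k.\ t_2\ t \\[4pt]
\mathsf{Inductive}\ \mathsf{Bool} : \mathsf{Kind} := \mathsf{true} : \mathsf{Bool} \mid \mathsf{false} : \mathsf{Bool} \\[4pt]
\mathsf{le} : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Bool} \\
\mathsf{le}(\mathsf{zero}) = \lambda t{:}\mathsf{Nat}.\ \mathsf{true} \\
\mathsf{le}(\mathsf{succ}\ t) = \lambda t'{:}\mathsf{Nat}.\ \mathsf{ifez}\ t'\ \mathsf{Bool}\ \mathsf{false}\ (\mathsf{le}\ t) \\[4pt]
\mathsf{lt} : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Bool} \\
\mathsf{lt} = \lambda t{:}\mathsf{Nat}.\ \mathsf{le}\ (\mathsf{succ}\ t) \\[4pt]
\mathsf{Cond} : \mathsf{Bool} \to \mathsf{Kind} \to \mathsf{Kind} \to \mathsf{Kind} \\
\mathsf{Cond}(\mathsf{true}) = \lambda k_1{:}\mathsf{Kind}.\ \lambda k_2{:}\mathsf{Kind}.\ k_1 \\
\mathsf{Cond}(\mathsf{false}) = \lambda k_1{:}\mathsf{Kind}.\ \lambda k_2{:}\mathsf{Kind}.\ k_2
\end{array}
\]

Fig. 2. Examples of inductive definitions and elimination
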


— Kind schemas $\Pi k{:}u.\,u'$ and $u \to u'$ are used to typecheck the
kind-level function $\lambda k{:}u.\,\kappa$ and the application form
$\kappa_1\,\kappa_2$. We use them to write higher-order predicates and logic
connectives. For example, the logical negation operator can be written as
follows:

\[ \mathsf{Not} : \mathsf{Kind} \to \mathsf{Kind} = \lambda k{:}\mathsf{Kind}.\ k \to \mathsf{False} \]

The consistency of TL implies that a proposition and its negation cannot both
be inhabited: otherwise, applying the proof of the second to that of the first
would yield a proof of False.
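
The impredicative encodings of True, False, and Not used in the list above can be mirrored in a proof assistant. The following Lean 4 sketch is only an illustration: it assumes Lean's impredicative sort `Prop` as a stand-in for TL's Kind, and the names `TrueTL`, `FalseTL`, `NotTL`, `idTL`, `noContradiction`, and `falseTL_elim` are hypothetical, not part of TL.

```lean
-- Assumption: Lean's `Prop` plays the role of TL's sort Kind.
def TrueTL : Prop := ∀ k : Prop, k → k
def FalseTL : Prop := ∀ k : Prop, k
def NotTL (k : Prop) : Prop := k → FalseTL

-- The polymorphic identity is a proof of TrueTL.
theorem idTL : TrueTL := fun _ t => t

-- Applying a proof of `NotTL k` to a proof of `k` yields a proof of FalseTL.
theorem noContradiction (k : Prop) (p : k) (np : NotTL k) : FalseTL := np p

-- Any proof of FalseTL can be instantiated at Lean's own empty proposition.
theorem falseTL_elim (h : FalseTL) : False := h False
```

The last theorem shows why an inhabitant of False would be fatal: it could be instantiated at any proposition whatsoever.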

TL also provides a general mechanism for defining inductive types
[Paulin-Mohring 1993]. The term $\mathsf{Ind}(k{:}\mathsf{Kind})\{\vec{\kappa}\}$
introduces an inductive kind $k$ with constructors whose kinds are listed in
$\vec{\kappa}$. Here $k$ must only occur “positively” inside each $\kappa_i$
(see Appendix A for the formal definition of positivity). The term
$\mathsf{Ctor}(i, \kappa)$ refers to the $i$-th constructor in an inductive kind
$\kappa$. For presentation, we will use a friendlier syntax in the rest of this
paper. An inductive kind $I = \mathsf{Ind}(k{:}\mathsf{Kind})\{\vec{\kappa}\}$
will be written as:

\[
\begin{array}{lcl}
\mathsf{Inductive}\ I : \mathsf{Kind} & := & c_1 : [I/k]\kappa_1 \\
 & \mid & c_2 : [I/k]\kappa_2 \\
 & & \vdots \\
 & \mid & c_n : [I/k]\kappa_n
\end{array}
\]

We give an explicit name $c_i$ to each constructor, so $c_i$ is just an
abbreviation of $\mathsf{Ctor}(i, I)$. For simplicity, the current version of TL
does not include parameterized inductive kinds, but supporting them is quite
straightforward [Werner 1994; Paulin-Mohring 1993].

TL provides two iterators to support primitive recursion on inductive kinds.
The small elimination $\mathsf{Elim}[\kappa', \kappa](\tau')\{\vec{\tau}\}$ takes
a type term $\tau'$ of inductive kind $\kappa'$, performs the iterative
operation specified by $\vec{\tau}$ (which contains a branch for each
constructor of $\kappa'$), and returns a type term of kind $\kappa[\tau']$ as
the result. The large elimination $\mathsf{Elim}[\kappa', u](\tau)\{\vec{\kappa}\}$
takes a type term $\tau$ of inductive kind $\kappa'$, performs the iterative
operation specified by $\vec{\kappa}$, and returns a kind term of kind schema
$u$ as the result. These iterators generalize the Typerec operator used in
intensional type analysis [Harper and Morrisett 1995; Crary and Weirich 1999;
Trifonov et al. 2000].

\[
\begin{array}{llcl}
(\textit{sort}) & s & ::= & \mathsf{Kind} \mid \mathsf{Kscm} \mid \mathsf{Ext} \\
(\textit{var}) & X & ::= & z \mid k \mid t \\
(\textit{ptm}) & A, B & ::= & s \mid X \mid \lambda X{:}A.\,B \mid A\,B \mid \Pi X{:}A.\,B \\
 & & \mid & \mathsf{Ind}(X{:}\mathsf{Kind})\{\vec{A}\} \mid \mathsf{Ctor}(i, A) \mid \mathsf{Elim}[A', B'](A)\{\vec{B}\}
\end{array}
\]

Fig. 3. Syntax of the type language TL

Figure 2 gives a few examples of inductive definitions including the inductive
kinds Bool and Nat and several type-level functions which we will use in
Section 4. The small elimination for Nat takes the form
$\mathsf{Elim}[\mathsf{Nat}, \kappa](\tau')\{\tau_1; \tau_2\}$. Here, $\kappa$ is
a dependent kind with kind schema $\mathsf{Nat} \to \mathsf{Kind}$; $\tau'$ is
the argument, which has kind Nat. The term in the zero branch, $\tau_1$, has
kind $\kappa[\tau']$. The term in the succ branch, $\tau_2$, has kind
$\mathsf{Nat} \to \kappa[\tau'] \to \kappa[\tau']$. TL uses the
$\iota$-reduction to perform the iterator operation. For example, the two
$\iota$-reduction rules for Nat work as follows:

\[
\begin{array}{lcl}
\mathsf{Elim}[\mathsf{Nat}, \kappa](\mathsf{zero})\{\tau_1; \tau_2\} & \leadsto_\iota & \tau_1 \\
\mathsf{Elim}[\mathsf{Nat}, \kappa](\mathsf{succ}\ \tau)\{\tau_1; \tau_2\} & \leadsto_\iota & \tau_2\ \tau\ (\mathsf{Elim}[\mathsf{Nat}, \kappa](\tau)\{\tau_1; \tau_2\})
\end{array}
\]

The general $\iota$-reduction rule is defined formally in Appendix A. In our
examples, we take the liberty of using pattern-matching syntax (as in ML) to
express the iterator operations, but they can be easily converted back to the
Elim form.
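
The two reduction rules for the Nat iterator have a direct analogue in the recursor of a proof assistant. As a rough illustration (not TL itself), the Lean 4 sketch below assumes that Lean's built-in `Nat.rec` with a constant motive plays the role of the small elimination on Nat; both equations are accepted by `rfl`, i.e. they hold by computation alone.

```lean
-- Zero branch: the iterator on zero reduces to the first branch t1.
example (k : Type) (t1 : k) (t2 : Nat → k → k) :
    Nat.rec (motive := fun _ => k) t1 t2 Nat.zero = t1 := rfl

-- Successor branch: the iterator on (succ n) reduces to t2 applied to n
-- and to the result of iterating on n.
example (k : Type) (t1 : k) (t2 : Nat → k → k) (n : Nat) :
    Nat.rec (motive := fun _ => k) t1 t2 (Nat.succ n)
      = t2 n (Nat.rec (motive := fun _ => k) t1 t2 n) := rfl
```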

In Figure 2, plus is a function which calculates the sum of two natural
numbers. The function ifez behaves like a switch statement: if its argument is
zero, it returns a function that selects the first branch; otherwise, the
result takes the second branch and applies it to the predecessor of the
argument. The function le evaluates to true if its first argument is less than
or equal to the second. The function lt performs the less-than comparison.

The definition of function Cond, which implements a conditional with result
at the kind level, is expanded into TL using large elimination on Bool, of the
form $\mathsf{Elim}[\mathsf{Bool}, u](\tau)\{\kappa_1; \kappa_2\}$, where $\tau$
is of kind Bool, and both the true and false branches ($\kappa_1$ and
$\kappa_2$) have kind schema $u$.
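
For readers who want to experiment, the definitions of Figure 2 can be transcribed almost literally into Lean 4. This is a sketch under stated assumptions: Lean's `Type` stands in for TL's Kind, the inductive kind Nat is renamed `N` to avoid a clash with Lean's built-in naturals, and `le` passes `fun p => le t p` where the figure writes the partial application.

```lean
inductive N where
  | zero : N
  | succ : N → N

def plus : N → N → N
  | .zero,   t' => t'
  | .succ t, t' => .succ (plus t t')

-- Switch on zero/successor; the successor branch receives the predecessor.
def ifez : N → (k : Type) → k → (N → k) → k
  | .zero,   _, t1, _  => t1
  | .succ t, _, _,  t2 => t2 t

def le : N → N → Bool
  | .zero,   _  => true
  | .succ t, t' => ifez t' Bool false (fun p => le t p)

def lt (t : N) : N → Bool := le (.succ t)

-- Large elimination: a conditional whose result is itself a type.
def Cond : Bool → Type → Type → Type
  | true,  k1, _  => k1
  | false, _,  k2 => k2

example : plus (N.succ N.zero) (N.succ N.zero) = N.succ (N.succ N.zero) := rfl
example : lt (N.succ N.zero) (N.succ (N.succ N.zero)) = true := rfl
example : lt (N.succ N.zero) (N.succ N.zero) = false := rfl
example : Cond true Nat Bool = Nat := rfl
```

The `example` lines check 1 + 1 = 2, 1 < 2, not 1 < 1, and that `Cond true` selects its first argument.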

3.2  Formalization 

We want to give a formal semantics to TL and then reason about its
meta-theoretic properties. But the five $\Pi$ constructs have many
similarities, so in the rest of this paper, we will model TL as a pure type
system (PTS) [Barendregt 1991] extended with inductive definitions.
Intuitively, instead of having a separate syntactic category for each level, we
collapse all kind schemas $u$, kind terms $\kappa$, type terms $\tau$, and the
external constant Kscm into a single set of pseudoterms (ptm), denoted as $A$
or $B$. Similar constructs can now share typing rules and reduction relations.
Figure 3 gives the syntax of TL, written in PTS style. There is now only
one $\Pi$ construct ($\Pi X{:}A.\,B$), one $\lambda$-abstraction
($\lambda X{:}A.\,B$), and one application form ($A\,B$); the two iterators for
inductive definitions are also merged into one
($\mathsf{Elim}[A', B'](A)\{\vec{B}\}$). We use $X$ and $Y$ to represent generic
variables, but we will still use $t$, $k$, and $z$ if the class of a variable
is specific.

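TL has the following PTS specification which we will use to derive its typing
rules:

\[
\begin{array}{lcl}
\mathcal{S} & = & \{\mathsf{Kind}, \mathsf{Kscm}, \mathsf{Ext}\} \\
\mathcal{A} & = & \{\mathsf{Kind} : \mathsf{Kscm},\ \mathsf{Kscm} : \mathsf{Ext}\} \\
\mathcal{R} & = & \{(\mathsf{Kind}, \mathsf{Kind}),\ (\mathsf{Kscm}, \mathsf{Kind}),\ (\mathsf{Ext}, \mathsf{Kind}), \\
 & & \ \ (\mathsf{Kind}, \mathsf{Kscm}),\ (\mathsf{Kscm}, \mathsf{Kscm})\}
\end{array}
\]

Here $\mathcal{S}$ is the set of sorts used to denote universes. We have added
the constant Ext to support quantification over Kscm. The names we use for
sorts reflect the fact that we have lifted the language one level up; they are
related to other systems via the following table: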


System                        Notation
TL                            Kind         Kscm       Ext
Werner [1994]                 Set          Type       Ext
Coq/CIC [Huet et al. 2000]    Set, Prop    Type(0)    Type(1)
Barendregt [1991]             *            □          △

The axioms in the set $\mathcal{A}$ denote the relationship between different
sorts; an axiom “$s_1 : s_2$” means that $s_2$ classifies $s_1$. The pairs
(rules) in the set $\mathcal{R}$ are used to define the well-formed $\Pi$
constructs, from which we can deduce the set of well-formed
$\lambda$-definitions and applications. For example, the five rules for TL can
be related to the five $\Pi$ constructs through the following table:


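Rule            $\Pi X{:}A.\,B$                     $\lambda X{:}A.\,B$                    $A\,B$
(Kind, Kind)    $\Pi t{:}\kappa_1.\,\kappa_2$       $\lambda t{:}\kappa.\,\tau$            $\tau_1\,\tau_2$
(Kscm, Kind)    $\Pi k{:}u.\,\kappa$                $\lambda k{:}u.\,\tau$                 $\tau[\kappa]$
(Ext, Kind)     $\Pi z{:}\mathsf{Kscm}.\,\kappa$    $\lambda z{:}\mathsf{Kscm}.\,\tau$     $\tau[u]$
(Kind, Kscm)    $\Pi t{:}\kappa.\,u$                $\lambda t{:}\kappa_1.\,\kappa_2$      $\kappa[\tau]$
(Kscm, Kscm)    $\Pi k{:}u_1.\,u_2$                 $\lambda k{:}u.\,\kappa$               $\kappa\,\kappa'$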
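
The PTS specification is small enough to be written down as two lookup functions. The Lean 4 sketch below is only illustrative; the names `TLSort`, `axiomOf`, and `piSort` are hypothetical. `axiomOf` encodes the axiom set, and `piSort` returns the sort of a product whose domain lives in the first sort and whose body lives in the second, when that pair is one of the five rules.

```lean
inductive TLSort where
  | kind | kscm | ext
  deriving DecidableEq, Repr

-- Axioms: Kind : Kscm and Kscm : Ext; Ext itself has no classifier.
def axiomOf : TLSort → Option TLSort
  | .kind => some TLSort.kscm
  | .kscm => some TLSort.ext
  | .ext  => none

-- Rules: the five admissible (s1, s2) pairs; the product lives in s2.
def piSort : TLSort → TLSort → Option TLSort
  | .kind, .kind => some TLSort.kind
  | .kscm, .kind => some TLSort.kind
  | .ext,  .kind => some TLSort.kind
  | .kind, .kscm => some TLSort.kscm
  | .kscm, .kscm => some TLSort.kscm
  | _,     _     => none

-- Quantifying over Kscm inside a kind is allowed ...
example : piSort TLSort.ext TLSort.kind = some TLSort.kind := rfl
-- ... but not inside a kind schema.
example : piSort TLSort.ext TLSort.kscm = none := rfl
```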

We define a context $\Delta$ as a list of bindings from variables to
pseudoterms:

\[ (\textit{ctxt}) \quad \Delta ::= \cdot \mid \Delta, X{:}A \]

The typing judgment for TL in PTS style now takes the form
$\Delta \vdash A : A'$, meaning that within context $\Delta$, the pseudoterm
$A$ is well-formed and has $A'$ as its classifier. We can now write a single
typing rule for all the $\Pi$ constructs:

\[
\frac{\Delta \vdash A : s_1 \qquad \Delta, X{:}A \vdash B : s_2 \qquad (s_1, s_2) \in \mathcal{R}}
     {\Delta \vdash \Pi X{:}A.\,B : s_2}
\]
Taking rule (Kind, Kscm) as an example, to build a well-formed term
$\Pi X{:}A.\,B$, which will be a kind schema (because $s_2$ is Kscm), we need to
show that $A$ is a well-formed kind and $B$ is a well-formed kind schema
assuming $X$ has kind $A$. We can also share the typing rules for all
$\lambda$-definitions and applications:

\[
\frac{\Delta, X{:}A \vdash B : B' \qquad \Delta \vdash \Pi X{:}A.\,B' : s}
     {\Delta \vdash \lambda X{:}A.\,B : \Pi X{:}A.\,B'}
\quad (\text{FUN})
\]

\[
\frac{\Delta \vdash A : \Pi X{:}B'.\,A' \qquad \Delta \vdash B : B'}
     {\Delta \vdash A\,B : [B/X]A'}
\quad (\text{APP})
\]

The reduction relations can also be shared. TL supports the standard $\beta$-
and $\eta$-reductions (denoted by $\leadsto_\beta$ and $\leadsto_\eta$) plus the
previously mentioned $\iota$-reduction (denoted by $\leadsto_\iota$) on
inductive objects (see Appendix A). The relations $\rhd_\beta$, $\rhd_\eta$,
and $\rhd_\iota$ are the contextual closures of the relations $\leadsto_\beta$,
$\leadsto_\eta$, and $\leadsto_\iota$ respectively.

We use $\leadsto$ and $\rhd$ for the unions of the above relations. We also
write $=_{\beta\eta\iota}$ for the reflexive, symmetric, and transitive closure
of $\rhd$.

The  complete  typing  rules  for  TL  and  the  definitions  of  all  the  reduction  re¬ 
lations  are  given  in  Appendix  A.  Following  Werner  [1994]  and  Geuvers  [1993], 
we  have  shown  that  TL  satisfies  all  the  key  meta-theoretic  properties,  includ¬ 
ing  subject  reduction,  strong  normalization,  Church-Rosser  (and  confluence), 
and  consistency  of  the  underlying  logic.  The  detailed  proofs  for  these  proper¬ 
ties  are  given  in  the  companion  technical  report  [Shao  et  al.  2001]. 


Theorem 3.1 (Subject reduction) If the judgment $\Delta \vdash A : B$ is
derivable, and $A \rhd A'$, then $\Delta \vdash A' : B$ is derivable.

Proof sketch The detailed proof is given in the companion technical report
[Shao et al. 2001]. We first define a calculus of unmarked terms. These are TL
terms with no annotations at lambda abstractions. We show that this language
is confluent. From this, we can prove that TL satisfies a weak form of
confluence (also known as the Geuvers lemma [Geuvers 1993]); it says that a
term that is equal to one in head normal form can be reduced to an
$\eta$-expanded version of this head normal form. From the weak confluence, we
then prove the inversion lemma which relates the structure of a term to its
typing derivation. We then prove the uniqueness of types and subject reduction
for $\beta\iota$ reductions. Finally, we prove the strengthening lemma and then
subject reduction for $\eta$ reduction. □


Theorem  3.2  (Strong  normalization)  All  well  typed  terms  are  strongly 
normalizing. 

Proof sketch The detailed proof is presented in our technical report [Shao
et al. 2001]. It is a straightforward extension of the proof given by Werner
[1994]. First we introduce a calculus of pure terms; this is just the pure
$\lambda$-calculus extended with a recursive filtering operator; we do this so
that we can operate in a confluent calculus. We then define a notion of
reducibility candidates; every kind schema gives rise to a reducibility
candidate; we also show how these candidates can be constructed inductively.
We define a notion of well constructed kinds which is a weak form of typing. We
associate an interpretation to each well formed kind. We show that under
adequate conditions, this interpretation is a candidate. We show that type
level constructs such as abstractions and constructors belong to the candidate
associated with their kind. We show that the interpretation of a kind remains
the same under $\beta\eta$ reduction. We then define a notion of kinds that are
invariant on their domain; these are kinds whose interpretation remains the
same upon reduction. We show that kinds formed with large elimination are
invariant on their domain. From here we can show the strong normalization of
the calculus of pure terms; we show that if a type is well formed, then the
pure term derived from it is strongly normalizing. Finally, we reduce the
strong normalization of all well formed terms to the strong normalization of
pure terms. □

Theorem 3.3 (Church-Rosser) Let $\Delta \vdash A : B$ and
$\Delta \vdash A' : B$ be two derivable judgments. If $A =_{\beta\eta\iota} A'$,
and if $A$ and $A'$ are in normal form, then $A = A'$.

Proof sketch The detailed proof is given in the companion technical report
[Shao et al. 2001]. We first prove that a well typed term in $\beta\iota$ normal
form has the same $\eta$ reductions as its corresponding unmarked term. From
here, we know that if $A$ and $A'$ are in normal form, then their corresponding
unmarked terms are equal. We then show that the annotations in the
$\lambda$-abstractions are equal. □

Theorem 3.4 (Consistency of the logic) There exists no term $A$ for which
$\cdot \vdash A : \mathsf{False}$.

Proof sketch Suppose $A$ is a term for which $\cdot \vdash A : \mathsf{False}$.
By Theorem 3.2, there exists a normal form $B$ for $A$. By Theorem 3.1,
$\cdot \vdash B : \mathsf{False}$. We can now show that this leads to a
contradiction by case analysis of the possible normal forms of types in the
calculus. □

4. THE COMPUTATION LANGUAGE $\lambda_H$

The language of computations $\lambda_H$ for our high-level certified
intermediate format uses proofs, constructed in the type language, to verify
propositions which ensure the runtime safety of the program. Furthermore, in
comparison with other higher-order typed calculi, the types assigned to
programs can be more refined, since program invariants expressible in
higher-order predicate logic can be represented in our type language. These
more precise types serve as more complete specifications of the behavior of
program components, and thus allow the static verification of more programs.

One approach to presenting a language of computations is to encode its syntax
and semantics in a proof system, with the benefit of obtaining
machine-checkable proofs of its properties, for instance type safety. This
appears to be even more promising for a system with a type language like CIC,
which is more expressive than higher-order predicate logic: the CIC proofs of
some program properties, embedded as type terms in the program, may not be
easily representable in meta-logical terms, thus it may be simpler to perform
all the reasoning in CIC. However, our exposition of the language TL is focused
on its use as a type language, and consequently it does not include all
features of CIC. We therefore leave this possibility for future work, and give
a standard meta-logical presentation instead; we address some of the issues
related to adequacy in our discussion of type safety.

\[
\begin{array}{llcl}
(\textit{exp}) & e & ::= & x \mid \bar{n} \mid \mathsf{tt} \mid \mathsf{ff} \mid f \mid \mathsf{fix}\ x{:}A.\,f \mid e\,e' \mid e[A] \mid \langle X = A,\ e : A' \rangle \\
 & & \mid & \mathsf{open}\ e\ \mathsf{as}\ \langle X, x \rangle\ \mathsf{in}\ e' \mid \langle e_0, \ldots, e_{n-1} \rangle \mid \mathsf{sel}[A](e, e') \\
 & & \mid & e\ \mathit{aop}\ e' \mid e\ \mathit{cop}\ e' \mid \mathsf{if}[A, A'](e,\ X_1.\,e_1,\ X_2.\,e_2) \qquad \text{where } n \in \mathbb{N} \\
(\textit{fun}) & f & ::= & \lambda x{:}A.\,e \mid \Lambda X{:}A.\,f \\
(\textit{arith}) & \mathit{aop} & ::= & + \mid \ldots \\
(\textit{cmp}) & \mathit{cop} & ::= & < \mid \ldots
\end{array}
\]

Fig. 4. Syntax of the computation language $\lambda_H$.

In this section we use the unqualified “term” to refer to a computation term
(expression) $e$, with syntax defined in Figure 4. Most of the constructs are
borrowed from standard higher-order typed calculi. To simplify the exposition
we only consider constants representing natural numbers ($\bar{n}$ is the value
representing $n \in \mathbb{N}$) and boolean values (tt and ff). The term-level
abstraction and application are standard; type abstractions and fixed points
are restricted to function values, with the call-by-value semantics in mind and
to simplify the CPS and closure conversions. The type variable bound by a type
abstraction, as well as the one bound by the open construct for packages of
existential type, can have either a kind or a kind schema. Dually, the type
argument in a type application, and the witness type term $A$ in the package
construction $\langle X = A,\ e : A' \rangle$ can be either a type term or a
kind term.

The constructs implementing tuple operations, arithmetic, and comparisons
have nonstandard static semantics, on which we focus in Section 4.2, but their
runtime behavior is standard. The branching construct is parameterized at
the type level with a proposition (which is dependent on the value of the test
term) and its proof; the proof is passed to the executed branch.


4.1  Dynamic  semantics 

We present a small-step call-by-value operational semantics for $\lambda_H$ in
the style of Wright and Felleisen [1994]. The values are defined inductively by

\[
v ::= \bar{n} \mid \mathsf{tt} \mid \mathsf{ff} \mid f \mid \mathsf{fix}\ x{:}A.\,f \mid \langle X = A,\ v : A' \rangle \mid \langle v_0, \ldots, v_{n-1} \rangle
\]


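The reduction relation $\hookrightarrow$ is specified by the following rules.

\[
\begin{array}{rcll}
(\lambda x{:}A.\,e)\,v & \hookrightarrow & [v/x]e & (\text{R-}\beta) \\
(\Lambda X{:}B.\,f)[A] & \hookrightarrow & [A/X]f & (\text{R-TY-}\beta) \\
\mathsf{sel}[A](\langle v_0, \ldots, v_{n-1} \rangle, \bar{m}) & \hookrightarrow & v_m \quad (m < n) & (\text{R-SEL}) \\
\mathsf{open}\ \langle X' = A,\ v : A' \rangle\ \mathsf{as}\ \langle X, x \rangle\ \mathsf{in}\ e & \hookrightarrow & [v/x][A/X]e & (\text{R-OPEN}) \\
(\mathsf{fix}\ x{:}A.\,f)\,v & \hookrightarrow & ([\mathsf{fix}\ x{:}A.\,f/x]f)\,v & (\text{R-FIX}) \\
(\mathsf{fix}\ x{:}A.\,f)[A] & \hookrightarrow & ([\mathsf{fix}\ x{:}A.\,f/x]f)[A] & (\text{R-TYFIX}) \\
\bar{m} + \bar{n} & \hookrightarrow & \overline{m + n} & (\text{R-ADD}) \\
\bar{m} < \bar{n} & \hookrightarrow & \mathsf{tt} \quad (m < n) & (\text{R-LT-T}) \\
\bar{m} < \bar{n} & \hookrightarrow & \mathsf{ff} \quad (m \geq n) & (\text{R-LT-F}) \\
\mathsf{if}[B, A](\mathsf{tt},\ X_1.\,e_1,\ X_2.\,e_2) & \hookrightarrow & [A/X_1]e_1 & (\text{R-IF-T}) \\
\mathsf{if}[B, A](\mathsf{ff},\ X_1.\,e_1,\ X_2.\,e_2) & \hookrightarrow & [A/X_2]e_2 & (\text{R-IF-F})
\end{array}
\]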

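An evaluation context $E$ encodes the call-by-value discipline:

\[
\begin{array}{lcl}
E & ::= & \bullet \mid E\,e \mid v\,E \mid E[A] \mid \langle X = A,\ E : A' \rangle \mid \mathsf{open}\ E\ \mathsf{as}\ \langle X, x \rangle\ \mathsf{in}\ e \\
 & \mid & \langle v_0, \ldots, v_{i-1}, E, e_{i+1}, \ldots, e_{n-1} \rangle \mid \mathsf{sel}[A](E, e) \mid \mathsf{sel}[A](v, E) \\
 & \mid & \mathsf{if}[A, A'](E,\ X_1.\,e_1,\ X_2.\,e_2) \mid E\ \mathit{aop}\ e \mid v\ \mathit{aop}\ E \mid E\ \mathit{cop}\ e \mid v\ \mathit{cop}\ E
\end{array}
\]

The notation $E\{e\}$ stands for the term obtained by replacing the hole
$\bullet$ in $E$ by $e$. The single step computation $\mapsto$ relates $E\{e\}$
to $E\{e'\}$ when $e \hookrightarrow e'$, and $\mapsto^*$ is its reflexive
transitive closure.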

As shown, the semantics is standard except for some additional passing of
type terms in R-SEL and R-IF-T/F. However, an inspection of the rules shows
that types are irrelevant for the evaluation; hence a type-erasure semantics,
in which all type-related operations and parameters are erased, would be
entirely standard.
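
To make the remark about type erasure concrete, here is a small Lean 4 sketch of a one-step reducer for an erased fragment of the language (numbers, booleans, addition, less-than, and the conditional). It is an illustration under assumptions, not the paper's semantics: functions, packages, tuples, and all type arguments are omitted, and the names `Exp`, `isValue`, `step`, and `run` are hypothetical. The comments indicate which reduction rule or evaluation context each case corresponds to.

```lean
inductive Exp where
  | num : Nat → Exp
  | tt  : Exp
  | ff  : Exp
  | add : Exp → Exp → Exp
  | lt  : Exp → Exp → Exp
  | ite : Exp → Exp → Exp → Exp   -- erased `if`: no proposition, no proof
  deriving Repr

def isValue : Exp → Bool
  | .num _ | .tt | .ff => true
  | _ => false

-- One step of call-by-value reduction; `none` means no rule applies.
def step : Exp → Option Exp
  | .add (.num m) (.num n) => some (Exp.num (m + n))                   -- R-ADD
  | .lt  (.num m) (.num n) => some (if m < n then Exp.tt else Exp.ff)  -- R-LT-T, R-LT-F
  | .ite .tt e1 _          => some e1                                  -- R-IF-T
  | .ite .ff _  e2         => some e2                                  -- R-IF-F
  | .add e1 e2 =>
      if isValue e1 then (step e2).map (fun e2' => Exp.add e1 e2')     -- v aop E
      else (step e1).map (fun e1' => Exp.add e1' e2)                   -- E aop e
  | .lt e1 e2 =>
      if isValue e1 then (step e2).map (fun e2' => Exp.lt e1 e2')      -- v cop E
      else (step e1).map (fun e1' => Exp.lt e1' e2)                    -- E cop e
  | .ite c e1 e2 => (step c).map (fun c' => Exp.ite c' e1 e2)          -- if(E, ...)
  | _ => none

-- Iterate `step` for at most `fuel` steps.
def run : Nat → Exp → Exp
  | 0, e => e
  | fuel + 1, e =>
    match step e with
    | some e' => run fuel e'
    | none    => e

-- if 1 < 2 then 3 + 4 else 0: three steps lead to the value 7.
#eval run 10 (Exp.ite (Exp.lt (Exp.num 1) (Exp.num 2))
                      (Exp.add (Exp.num 3) (Exp.num 4)) (Exp.num 0))
```

Nothing in `step` inspects a type or a proof, which is the sense in which the erased semantics is entirely standard.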

4.2  Static  semantics 

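The static semantics of $\lambda_H$ shows the benefits of using a type language
as expressive as TL. We can now define the type constructors of $\lambda_H$ as
constructors of an inductive kind $\Omega$, instead of having them built into
$\lambda_H$. As we will show in Section 5, this property is crucial for the
conversion to CPS, since it makes it possible to transform direct-style types
to CPS types within the type language.

\[
\begin{array}{lcll}
\mathsf{Inductive}\ \Omega : \mathsf{Kind} & := & \mathsf{snat} & : \mathsf{Nat} \to \Omega \\
 & \mid & \mathsf{sbool} & : \mathsf{Bool} \to \Omega \\
 & \mid & \twoheadrightarrow & : \Omega \to \Omega \to \Omega \\
 & \mid & \mathsf{tup} & : \mathsf{Nat} \to (\mathsf{Nat} \to \Omega) \to \Omega \\
 & \mid & \forall_{\mathsf{Kind}} & : \Pi k{:}\mathsf{Kind}.\ (k \to \Omega) \to \Omega \\
 & \mid & \exists_{\mathsf{Kind}} & : \Pi k{:}\mathsf{Kind}.\ (k \to \Omega) \to \Omega \\
 & \mid & \forall_{\mathsf{Kscm}} & : \Pi z{:}\mathsf{Kscm}.\ (z \to \Omega) \to \Omega \\
 & \mid & \exists_{\mathsf{Kscm}} & : \Pi z{:}\mathsf{Kscm}.\ (z \to \Omega) \to \Omega
\end{array}
\]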
Informally, all well-formed computations have types of kind $\Omega$, including
singleton types of natural numbers $\mathsf{snat}\ A$ and boolean values
$\mathsf{sbool}\ B$, as well as function, tuple, polymorphic and existential
types. To improve readability we also define the syntactic sugar

\[
\begin{array}{rcl}
A \to B & \equiv & \twoheadrightarrow A\ B \\
\forall_s X{:}A.\ B & \equiv & \forall_s\ A\ (\lambda X{:}A.\ B) \\
\exists_s X{:}A.\ B & \equiv & \exists_s\ A\ (\lambda X{:}A.\ B)
\end{array}
\qquad \text{where } s \in \{\mathsf{Kind}, \mathsf{Kscm}\}
\]

and often drop the sort $s$ when $s = \mathsf{Kind}$; for example the type
void, containing no values, is defined as
$\forall t{:}\Omega.\ t \equiv \forall_{\mathsf{Kind}}\ \Omega\ (\lambda t{:}\Omega.\ t)$.

Using this syntactic sugar we can give a familiar look to many of the
formation rules for $\lambda_H$ expressions and functional values. Figure 5
contains the inference rules for deriving judgments of the form
$\Delta; \Gamma \vdash e : A$, which assign type $A$ to the expression $e$ in a
context $\Delta$ and a type environment $\Gamma$ defined by

\[ (\textit{type env}) \quad \Gamma ::= \cdot \mid \Gamma, x{:}A \]

We  introduce  some  of  the  notation  used  in  these  rules  in  the  course  of  the 
discussion. 

Rules E-NAT, E-TRUE, and E-FALSE assign singleton types to numeric and
boolean constants. For instance the constant $\bar{1}$ has type
$\mathsf{snat}\ (\mathsf{succ}\ \mathsf{zero})$ in any valid environment. In
rule E-NAT we use the meta-function $\hat{\cdot}$ to map natural numbers
$n \in \mathbb{N}$ to their representations as type terms. It is defined
inductively by $\hat{0} = \mathsf{zero}$ and
$\widehat{n+1} = \mathsf{succ}\ \hat{n}$, so $\Delta \vdash \hat{n} : \mathsf{Nat}$
holds for all valid $\Delta$ and $n \in \mathbb{N}$.

Singleton types play a central role in reflecting properties of values in the
type language, where we can reason about them constructively. For instance
rules E-ADD and E-LT use respectively the type terms plus and lt (defined in
Section 3) to reflect the semantics of the term operations into the type level
via singleton types.

However, if we could assign only singleton types to computation terms, in
a decidable type system we would only be able to typecheck terminating
programs. We regain expressiveness of the computation language by using
existential types to hide some of the overly detailed type information. Thus
for example one can define the usual types of all natural numbers and boolean
values as

\[
\begin{array}{l}
\mathsf{nat} : \Omega = \exists t{:}\mathsf{Nat}.\ \mathsf{snat}\ t \\
\mathsf{bool} : \Omega = \exists t{:}\mathsf{Bool}.\ \mathsf{sbool}\ t
\end{array}
\]


For  any  term  e  with  singleton  type  snat  A  the  package  ( t  =  A ,  e :  snat  t)  has  type 
nat.  Since  in  a  type-erasure  semantics  of  Ah  all  types  and  operations  on  them 
are  erased,  there  is  no  runtime  overhead  for  the  packaging.  For  each  n  £  N 
there  is  a  value  of  this  type  denoted  by  n  =  ( t  =  n,n  :  snat  t).  Operations  on 
terms  of  type  nat  are  derived  from  operations  on  terms  of  singleton  types  of  the 
form  snat  A;  for  example  an  addition  function  of  type  nat  — >  nat  — >  nat  is  defined 
as  the  expression 

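\[
\begin{array}{l}
\mathsf{add} = \lambda x_1{:}\mathsf{nat}.\ \lambda x_2{:}\mathsf{nat}. \\
\qquad \mathsf{open}\ x_1\ \mathsf{as}\ \langle t_1, x_1' \rangle\ \mathsf{in} \\
\qquad \mathsf{open}\ x_2\ \mathsf{as}\ \langle t_2, x_2' \rangle\ \mathsf{in} \\
\qquad \langle t = \mathsf{plus}\ t_1\ t_2,\ x_1' + x_2' : \mathsf{snat}\ t \rangle
\end{array}
\]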
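
The interplay of singleton types and existential packages in this addition function can be imitated in Lean 4. The sketch below makes several simplifying assumptions: Lean's built-in `Nat` and `+` stand in for the inductive kind Nat and the type-level plus, a subtype models the singleton type, and a dependent pair models the existential package. The names `SNat`, `NatPkg`, `sadd`, `add`, and `pack` are hypothetical.

```lean
-- Singleton type: the only inhabitant of `SNat t` is the number t itself.
abbrev SNat (t : Nat) : Type := { x : Nat // x = t }

-- Existential package hiding the index, in the spirit of nat = ∃t:Nat. snat t.
abbrev NatPkg : Type := (t : Nat) × SNat t

-- Addition on singletons: the result index reflects the sum (cf. rule E-ADD).
def sadd {t1 t2 : Nat} (x1 : SNat t1) (x2 : SNat t2) : SNat (t1 + t2) :=
  ⟨x1.val + x2.val, by rw [x1.property, x2.property]⟩

-- Addition on packages: open both arguments, add, and repackage.
def add (x1 x2 : NatPkg) : NatPkg :=
  ⟨x1.1 + x2.1, sadd x1.2 x2.2⟩

-- Package a plain number with itself as the witness.
def pack (n : Nat) : NatPkg := ⟨n, ⟨n, rfl⟩⟩

#eval (add (pack 2) (pack 3)).2.val   -- 5
```

Unlike the calculus described in the text, Lean keeps the witness component of a dependent pair at run time unless it is proof-irrelevant, so this sketch illustrates the typing discipline rather than the erasure property.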
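\[
\frac{\Delta \vdash \Gamma\ \mathsf{ok} \qquad \Delta \vdash A : \Omega}
     {\Delta \vdash \Gamma, x{:}A\ \mathsf{ok}}
\quad (\text{TE-EXT})
\]

\[
\frac{\Delta; \Gamma \vdash e : \mathsf{snat}\ A \qquad \Delta; \Gamma \vdash e' : \mathsf{snat}\ A'}
     {\Delta; \Gamma \vdash e + e' : \mathsf{snat}\ (\mathsf{plus}\ A\ A')}
\quad (\text{E-ADD})
\]

\[
\frac{\Delta; \Gamma \vdash e : \mathsf{snat}\ A \qquad \Delta; \Gamma \vdash e' : \mathsf{snat}\ A'}
     {\Delta; \Gamma \vdash e < e' : \mathsf{sbool}\ (\mathsf{lt}\ A\ A')}
\quad (\text{E-LT})
\]

\[
\frac{\begin{array}{ll}
\Delta \vdash B : \mathsf{Bool} \to \mathsf{Kind} & \Delta; \Gamma \vdash e : \mathsf{sbool}\ A'' \\
\Delta \vdash A : B\ A'' & \Delta, X_1{:}B\ \mathsf{true};\ \Gamma \vdash e_1 : A' \\
\Delta \vdash A' : \Omega & \Delta, X_2{:}B\ \mathsf{false};\ \Gamma \vdash e_2 : A'
\end{array}}
     {\Delta; \Gamma \vdash \mathsf{if}[B, A](e,\ X_1.\,e_1,\ X_2.\,e_2) : A'}
\quad (\text{E-IF})
\]

\[
\frac{\Delta \vdash A : \Omega \qquad \Delta; \Gamma, x{:}A \vdash e : A'}
     {\Delta; \Gamma \vdash \lambda x{:}A.\,e : A \to A'}
\quad (\text{E-FUN})
\qquad
\frac{\Delta; \Gamma \vdash e_1 : A \to A' \qquad \Delta; \Gamma \vdash e_2 : A}
     {\Delta; \Gamma \vdash e_1\,e_2 : A'}
\quad (\text{E-APP})
\]

\[
\frac{\Delta \vdash B : s \qquad \Delta, X{:}B;\ \Gamma \vdash f : A}
     {\Delta; \Gamma \vdash \Lambda X{:}B.\,f : \forall_s X{:}B.\ A}
\quad (\text{E-TFUN}) \quad \text{where } X \notin \Delta,\ s \neq \mathsf{Ext}
\]

\[
\frac{\Delta; \Gamma \vdash e : \forall_s\ B\ A \qquad \Delta \vdash A' : B}
     {\Delta; \Gamma \vdash e[A'] : A\ A'}
\quad (\text{E-TAPP}) \quad \text{where } s \neq \mathsf{Ext}
\]

\[
\frac{\Delta \vdash A : B \qquad \Delta, X{:}B \vdash A' : \Omega \qquad \Delta \vdash B : s \qquad \Delta; \Gamma \vdash e : [A/X]A'}
     {\Delta; \Gamma \vdash \langle X = A,\ e : A' \rangle : \exists_s X{:}B.\ A'}
\quad (\text{E-PACK}) \quad \text{where } s \neq \mathsf{Ext}
\]

\[
\frac{\Delta; \Gamma \vdash e : \exists_s\ B\ A \qquad \Delta \vdash A' : \Omega \qquad \Delta, X{:}B;\ \Gamma, x{:}A\ X \vdash e' : A'}
     {\Delta; \Gamma \vdash \mathsf{open}\ e\ \mathsf{as}\ \langle X, x \rangle\ \mathsf{in}\ e' : A'}
\quad (\text{E-OPEN}) \quad \text{where } X \notin \Delta,\ s \neq \mathsf{Ext}
\]

\[
\frac{\text{for all } i < n \qquad \Delta; \Gamma \vdash e_i : A_i}
     {\Delta; \Gamma \vdash \langle e_0, \ldots, e_{n-1} \rangle : \mathsf{tup}\ \hat{n}\ (\mathsf{nth}\ (A_0 :: \ldots :: A_{n-1} :: \mathsf{nil}))}
\quad (\text{E-TUP})
\]

\[
\frac{\Delta; \Gamma \vdash e : \mathsf{tup}\ A''\ B \qquad \Delta; \Gamma \vdash e' : \mathsf{snat}\ A' \qquad \Delta \vdash A : \mathsf{LT}\ A'\ A''}
     {\Delta; \Gamma \vdash \mathsf{sel}[A](e, e') : B\ A'}
\quad (\text{E-SEL})
\]

\[
\frac{\Delta; \Gamma \vdash e : A \qquad A =_{\beta\eta\iota} A' \qquad \Delta \vdash A' : \Omega}
     {\Delta; \Gamma \vdash e : A'}
\quad (\text{E-CONV})
\]

[Rules (TE-MT), (E-VAR), (E-NAT), (E-TRUE), and (E-FALSE), and the premises of
the rule concluding $\Delta; \Gamma \vdash \mathsf{fix}\ x{:}A.\,f : A$, are
illegible in the source.]

Fig. 5. Static semantics of the computation language $\lambda_H$.
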

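Rule E-TUP assigns to a tuple a type of the form $\mathsf{tup}\ A\ B$, in which
the tup constructor is applied to a type $A$ representing the tuple size, and a
function $B$ mapping offsets to the types of the tuple components. This
function is defined in terms of operations on lists of types:

\[
\begin{array}{l}
\mathsf{Inductive}\ \mathsf{List} : \mathsf{Kind} := \mathsf{nil} : \mathsf{List} \mid \mathsf{cons} : \Omega \to \mathsf{List} \to \mathsf{List} \\[4pt]
\mathsf{nth} : \mathsf{List} \to \mathsf{Nat} \to \Omega \\
\mathsf{nth}\ \mathsf{nil} = \lambda t{:}\mathsf{Nat}.\ \mathsf{void} \\
\mathsf{nth}\ (\mathsf{cons}\ t_1\ t_2) = \lambda t{:}\mathsf{Nat}.\ \mathsf{ifez}\ t\ \Omega\ t_1\ (\mathsf{nth}\ t_2)
\end{array}
\]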
Thus $\mathsf{nth}\ L\ \hat{n}$ reduces to the $n$-th element of the list $L$
when $n$ is less than the length of $L$, and to void otherwise. We also use the
infix form $A :: A' \equiv \mathsf{cons}\ A\ A'$. The type of pairs is derived:
$A \times A' \equiv \mathsf{tup}\ \hat{2}\ (\mathsf{nth}\ (A :: A' :: \mathsf{nil}))$.
Thus for instance
$\langle \overline{42}, \bar{7} \rangle : \mathsf{snat}\ \widehat{42} \times \mathsf{snat}\ \hat{7}$
is a valid judgment.
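
The behavior of nth on in-range and out-of-range offsets can be checked with a small Lean 4 sketch. It assumes Lean's `Type` in place of the kind of computation types, Lean's `Empty` in place of void, and direct pattern matching on the offset instead of ifez; none of these choices come from the paper.

```lean
-- Look up the n-th type in a list of types; out-of-range offsets give Empty.
def nth : List Type → Nat → Type
  | [],      _     => Empty
  | t :: _,  0     => t
  | _ :: ts, n + 1 => nth ts n

example : nth [Nat, Bool] 0 = Nat := rfl
example : nth [Nat, Bool] 1 = Bool := rfl
example : nth [Nat, Bool] 2 = Empty := rfl   -- past the end of the list
```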

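The rules for selection and testing for the less-than relation (the only
comparison we discuss for brevity) refer to the kind term LT with kind schema
$\mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Kind}$. Intuitively, LT represents a
binary relation on kind Nat, so $\mathsf{LT}\ \hat{m}\ \hat{n}$ is the kind of
type terms representing proofs of $m < n$. LT can be thought of as the
parameterized inductive kind of proofs constructed from instances of the axioms
$\forall n \in \mathbb{N}.\ 0 < n+1$ and
$\forall m, n \in \mathbb{N}.\ m < n \supset m+1 < n+1$:

\[
\begin{array}{lcl}
\mathsf{Inductive}\ \mathsf{LT} : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Kind} & := & \mathsf{ltzs} : \Pi t{:}\mathsf{Nat}.\ \mathsf{LT}\ \mathsf{zero}\ (\mathsf{succ}\ t) \\
 & \mid & \mathsf{ltss} : \Pi t{:}\mathsf{Nat}.\ \Pi t'{:}\mathsf{Nat}.\ \mathsf{LT}\ t\ t' \to \mathsf{LT}\ (\mathsf{succ}\ t)\ (\mathsf{succ}\ t')
\end{array}
\]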

To simplify the presentation of our type language, we allowed inductive kinds
of kind schema Kind only. Thus to stay within the scope of this paper we
actually use a Church encoding of LT (given in Section 4.3); this is sufficient
since we never analyze proof objects, so the full power of elimination is
unnecessary for our use of LT.
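
For comparison, the parameterized inductive reading of LT (not the Church encoding that the paper actually uses) can be written as an indexed inductive proposition in Lean 4. This is a sketch under assumptions: Lean's built-in `Nat` replaces the inductive kind Nat, and `LTProof` and `LTProof.sound` are hypothetical names.

```lean
-- Proofs of m < n built from the two axioms 0 < n+1 and m < n ⊃ m+1 < n+1.
inductive LTProof : Nat → Nat → Prop where
  | ltzs (t : Nat) : LTProof 0 (t + 1)
  | ltss (t t' : Nat) : LTProof t t' → LTProof (t + 1) (t' + 1)

-- A proof object for 1 < 3, using each axiom once.
example : LTProof 1 3 := LTProof.ltss 0 2 (LTProof.ltzs 1)

-- Every such proof object implies Lean's built-in order on Nat.
theorem LTProof.sound {m n : Nat} (h : LTProof m n) : m < n := by
  induction h with
  | ltzs t => exact Nat.succ_pos t
  | ltss t t' _ ih => exact Nat.succ_lt_succ ih
```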

In the component selection construct sel[A](e, e') the type A represents a
proof that the value of the subscript e' is less than the size of the tuple e.
In rule E-SEL this condition is expressed as an application of the type term LT.
Due to the consistency of the logic represented in the type language, only the
existence and not the structure of the proof object A is important. Since its
existence is ensured statically in a well-formed expression, A would be
eliminated in a type-erasure semantics.

The conditional if [B, A](e, X1. e1, X2. e2) allows information obtained
dynamically (e.g., through comparisons) to be made available for static
reasoning in the form of proof parameters to its branches. The type term A
represents a proof of the proposition encoded by either B true or B false,
depending on the value of e. This proof is bound to the type variable (X1 or X2)
of the appropriate branch, which can use it in the construction of other proofs,
or with a proof-consuming primitive like sel. The correspondence between the
value of e and the kind of A is again established through a singleton boolean
type. Thus for instance if the run-time value of e asserts the truthfulness of
some proposition P, since the type parameter A'' of the singleton type of e
reflects the value of e at the type level, we can define B so that B A''
represents P or ¬P, depending on whether A'' =βηι true or A'' =βηι false, and
reason in each of the two branches under the assumption that P or ¬P,
respectively. Of course, for this reasoning to be sound, we need a proof that
A'' indeed reflects the truthfulness of P, that is, we need a proof term A of
kind B A''.

In fact if is more flexible than that: since B false does not have to be the
negation of B true, one can have imprecise information flow into the branches.
In particular the encoding of the usual oblivious (in the proof-passing sense)
if is possible using B ≡ λt:Bool. True; Section 4.3 gives another example, where
the information is precise only in one branch of the conditional.
4.3 Example: Bound check elimination

A simple example of the generation, propagation, and use of proofs in λ_H is a
function which computes the sum of the components of any vector of naturals.
Let us first introduce some auxiliary types and functions. The type assigned
to a homogeneous tuple (vector) of n terms of type A is βηι-convertible to the
form vec n̂ A for


vec : Nat → Ω → Ω
vec = λt:Nat. λt':Ω. tup t (nth (repeat t t'))

where

repeat : Nat → Ω → List
repeat zero = λt':Ω. nil
repeat (succ t) = λt':Ω. t'::(repeat t) t'

Then  we  can  define  a  term  which  sums  the  elements  of  a  vector  with  a  given 
length  as  follows: 

sumVec : ∀t:Nat. snat t → vec t nat → nat
  ≡ Λt:Nat. λn:snat t. λv:vec t nat.
      (fix loop : nat → nat → nat.
         λi:nat. λsum:nat. open i as ⟨t', i'⟩ in
           if [LTOrTrue t' t, ltPrf t' t]
              (i' < n,
               t1. loop (add i 1) (add sum (sel[t1](v, i'))),
               t2. sum)) 0 0


where 

LTOrTrue : Nat → Nat → Bool → Kind
LTOrTrue = λt1:Nat. λt2:Nat. λt:Bool. Cond t (LT t1 t2) True

and ltPrf of kind Πt':Nat. Πt:Nat. LTOrTrue t' t (lt t' t) is a type term defined
below; as its kind suggests, ltPrf A A' evaluates to a proof of LT A A', if A and
A' represent natural numbers n and n' such that n < n'.

The comparison i' < n, used in this example as a loop termination test, checks
whether the index i' is smaller than the vector size n. If it is, the adequacy
of the type term lt with respect to the less-than relation ensures that the type
term ltPrf t' t represents a proof of the corresponding proposition at the type
level, namely LT t' t. This proof is then bound to t1 in the first branch of the
if, and the sel construct uses it to verify that the i'-th element of v exists,
thus avoiding a second test. The type safety of λ_H (Theorem 4.6) guarantees that
implementations of sel need not check the subscript at runtime. Since the proof
t2 is ignored in the “else” branch, ltPrf t' t is defined to reduce to the
trivial proof of True when the value of i' is not less than that of n.
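
The same idea, a loop test whose success yields a proof consumed by the
indexing operation, can be sketched in Lean 4. This is an analogy rather than
a translation of sumVec: Lean's Array and its proof-carrying indexing v[i]'h
stand in for vec and sel, the names sumLoop and sumVec are ours, and the fuel
argument is an assumption introduced only to make the recursion structural.

```lean
-- Sketch: `fuel` bounds the number of iterations so the recursion is structural.
def sumLoop (v : Array Nat) : Nat → Nat → Nat → Nat
  | 0,        _, acc => acc
  | fuel + 1, i, acc =>
    -- `h : i < v.size` plays the role of the proof bound to t1 in the "then" branch.
    if h : i < v.size then
      sumLoop v fuel (i + 1) (acc + v[i]'h)  -- indexing is justified by `h`
    else
      acc                                     -- no proof is needed here

def sumVec (v : Array Nat) : Nat := sumLoop v v.size 0 0

#eval sumVec #[1, 2, 3]  -- 6
```

The single comparison i < v.size serves both as the termination test and as
the evidence that the access is in bounds, which is the point of the example in
the text.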

The usual vector type, which keeps the length packaged with the content, is

vector : Ω → Ω
vector = λt:Ω. ∃t':Nat. snat t' × vec t' t

Now we can write a wrapper function for sumVec operating on packaged vectors.

sumVector : vector nat → nat
  ≡ λv:vector nat.
      open v as ⟨t', v'⟩ in sumVec[t'] (sel[ltPrf 0 2](v', 0)) (sel[ltPrf 1 2](v', 1))

Next we show the type term ltPrf which generates the proof of the proposition
LTOrTrue t' t (lt t' t). We first present a Church encoding of the kind term LT
and its “constructors” ltzs and ltss.

LT : Nat → Nat → Kind
LT = λt:Nat. λt':Nat.
       ΠR:Nat → Nat → Kind.
         (Πt:Nat. R zero (succ t)) →
         (Πt:Nat. Πt':Nat. R t t' → R (succ t) (succ t')) →
         R t t'

ltzs : Πt:Nat. LT zero (succ t)
ltzs = λt:Nat. λR:Nat → Nat → Kind.
         λz:(Πt:Nat. R zero (succ t)).
         λs:(Πt:Nat. Πt':Nat. R t t' → R (succ t) (succ t')).
         z t

ltss : Πt:Nat. Πt':Nat. LT t t' → LT (succ t) (succ t')
ltss = λt:Nat. λt':Nat. λp:LT t t'. λR:Nat → Nat → Kind.
         λz:(Πt:Nat. R zero (succ t)).
         λs:(Πt:Nat. Πt':Nat. R t t' → R (succ t) (succ t')).
         s t t' (p R z s)
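
The Church encoding can be checked mechanically. The following Lean 4 sketch
mirrors the three definitions above, under the assumption that Lean's
impredicative Prop plays the role of Kind; the name LTc is ours, and ltzs and
ltss are restated as Lean theorems.

```lean
-- Sketch: LT as the intersection of all relations closed under the two axioms.
def LTc (m n : Nat) : Prop :=
  ∀ R : Nat → Nat → Prop,
    (∀ t, R 0 (t + 1)) →
    (∀ t t', R t t' → R (t + 1) (t' + 1)) →
    R m n

-- "Constructor" for 0 < t + 1: pick the zero case z.
theorem ltzs (t : Nat) : LTc 0 (t + 1) :=
  fun _ z _ => z t

-- "Constructor" for the successor step: replay p at R, then apply s.
theorem ltss (t t' : Nat) (p : LTc t t') : LTc (t + 1) (t' + 1) :=
  fun R z s => s t t' (p R z s)

example : LTc 1 3 := ltss 0 2 (ltzs 1)
```

Only introduction forms are defined; as the text notes, nothing here needs to
analyze a proof object.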

Next we define dependent conditionals on kinds Nat and Bool.

dep_ifez : Πt:Nat. Πk:Nat → Kind. k zero → (Πt':Nat. k (succ t')) → k t
dep_ifez zero = λk:Nat → Kind. λt1:k zero. λt2:(Πt':Nat. k (succ t')). t1
dep_ifez (succ t) = λk:Nat → Kind. λt1:k zero. λt2:(Πt':Nat. k (succ t')). t2 t

dep_if : Πt:Bool. Πk:Bool → Kind. k true → k false → k t
dep_if true = λk:Bool → Kind. λt1:k true. λt2:k false. t1
dep_if false = λk:Bool → Kind. λt1:k true. λt2:k false. t2

Note that, unlike the examples in Figure 2, the types of the branches in each
of these definitions are different: The type of the true branch of dep_if is

Πk:Bool → Kind. k true → k false → k true,

while that of its false branch is

Πk:Bool → Kind. k true → k false → k false.

This is achieved by specifying the kind term

λt:Bool. Πk:Bool → Kind. k true → k false → k t

as the second parameter of the Elim construct for which the sugared definition
of dep_if above stands. The resulting elimination term is type-correct because
the type of each branch is obtained by applying this kind term to the
corresponding constructor of Bool.
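
A dependent conditional of this shape can be written directly in Lean 4, where
the motive k is supplied as an ordinary argument. This is a sketch under
assumptions: the name depIf is ours, and Lean's universe-polymorphic Sort u
stands in for Kind.

```lean
universe u

-- Sketch of dep_if: the result type k b depends on the scrutinee b,
-- so the two branches have different types (k true and k false).
def depIf : (b : Bool) → (k : Bool → Sort u) → k true → k false → k b
  | true,  _, t1, _  => t1
  | false, _, _,  t2 => t2

-- Usage: each branch proves a different instance of the motive.
example (b : Bool) : b = true ∨ b = false :=
  depIf b (fun x => x = true ∨ x = false) (Or.inl rfl) (Or.inr rfl)
```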

Finally, we define some abbreviations, and then the proof generator itself.

LTcond : Nat → Nat → Kind
LTcond = λt':Nat. λt:Nat. LTOrTrue t' t (lt t' t)

LTsucc : Nat → Nat → Bool → Kind
LTsucc = λt':Nat. λt:Nat. λt'':Bool.
           LTOrTrue t' t t'' → LTOrTrue (succ t') (succ t) t''

ltPrf : Πt':Nat. Πt:Nat. LTcond t' t
ltPrf = λt':Nat.
  Elim[Nat, λt1':Nat. Πt1:Nat. LTcond t1' t1](t'){
    λt1:Nat. dep_ifez t1 (LTcond zero) id ltzs;
    λt1':Nat. λtP:(Πt1:Nat. LTcond t1' t1). λt1:Nat.
      dep_ifez
        t1
        (LTcond (succ t1'))
        id
        (λt1:Nat. dep_if (lt t1' t1) (LTsucc t1' t1) (ltss t1' t1) (id True) (tP t1))}
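
The specification of ltPrf, a total function that returns a real proof when
the comparison succeeds and a trivial one otherwise, can be sketched in Lean 4.
This is not the Elim-based construction above: it assumes Lean's built-in
order m < n in place of LT and Lean's decide in place of the type-level
function lt, and it restates the names LTOrTrue and ltPrf for illustration.

```lean
-- Sketch: precise information in the `true` case, none in the `false` case.
def LTOrTrue (m n : Nat) : Bool → Prop
  | true  => m < n
  | false => True

-- For every m and n there is a proof of LTOrTrue m n (decide (m < n)).
theorem ltPrf (m n : Nat) : LTOrTrue m n (decide (m < n)) := by
  cases h : decide (m < n) with
  | true  => exact of_decide_eq_true h   -- the test succeeded, so m < n holds
  | false => exact True.intro            -- trivial proof for the "else" branch
```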

4.4  Example:  Type  conversions 

The language λ_H offers only the bare minimum of constructs for programming
with TL types. However the reader may recall that λ_H is an intermediate
language, and ease of programming in it is not necessarily of high importance.
Much more important is that it has the flexibility to express the more complex
relationships between terms and types in other languages, to do this in terms
of simple constructs, which are relatively simple to reason about and
transform, and do it at no run-time cost. To a large extent this flexibility
comes from the use of type-level proof terms in λ_H.

One example of the power of programming with proof terms is the ability
to use λ_H in a way which allows more general type conversions than those
permitted by rule E-CONV. This rule allows the conversion of a term’s type only
to other βηι-equivalent types, but not to types which are provably equivalent in
some weaker sense. For instance it is impossible to convert a λ_H-term of type
vec (plus t1 t2) nat to a term of type vec (plus t2 t1) nat in a context where the
distinct type variables t1 and t2 have kind Nat, because the type terms plus t1 t2
and plus t2 t1, being different normal forms, are not βηι-equivalent.

A solution is to instead define and use types which represent equivalence
classes with respect to a relation of interest, in this case raw datatypes of λ_H
packaged together with proof terms of type equivalence. When a parameter
of a type constructor must be subjected to conversions in our program, we
can replace it by a derived type constructor which hides the actual “value”
of this parameter, and exposes only an equivalent value, with a proof of their
equivalence hidden in the package. Thus the singleton integer type snat(A) can
be replaced by the type snatp(A), defined as follows:

snatp : Nat → Ω
snatp = λt':Nat. ∃t:Nat. ∃P:Eq Nat t' t. snat(t)

In a package of type snatp(A) the variable P is bound to a proof of the equality
between A and the witness type bound to t, which represents the actual value
of the term-level integer component. As we will show shortly, this allows us to
easily convert a term of type snatp(A) to type snatp(A') when A and A' represent
natural numbers provably equal in the given context. The kind of equality
proofs Eq can be defined in CIC following Paulin-Mohring [1993] as

Eq : Πk:Kind. k → k → Kind
Eq = λk:Kind. λt:k. Ind(k':k → Kind){k' t}

refl : Πk:Kind. Πt:k. Eq k t t
refl = λk:Kind. λt:k. Ctor (1, Eq k t)

and its elimination allows us to define a type term showing this is actually
Leibniz equality:

Leibniz : Πk:Kind. Πt:k. Πt':k. Eq k t t' → ΠP:k → Kind. P t → P t'

By this definition of equality, the normal form of a term representing a proof
of equality between closed types A and A' is an application of the constructor
refl, whose kind ensures that the types are βηι-equivalent. The expressiveness
comes from the possibility to construct proofs of equality using case analysis
with dependent elimination to relate different normal forms. Consider the
following example. Proving that zero is a left unit of plus is trivial:

leftUnit : Πt:Nat. Eq Nat t (plus zero t)
leftUnit = refl Nat

because according to our definition of plus we have that plus zero t reduces
to t. Not so with proving that zero is a right unit of plus: The type term
plus t zero is in normal form (assuming plus stands for the elimination term of
TL defined in user-friendly form in Figure 2), not convertible to t. However it
is possible to encode an inductive proof, using dependent elimination on Nat:

rightUnit : Πt:Nat. Eq Nat t (plus t zero)
rightUnit zero = refl Nat zero
rightUnit (succ t) = eqf Nat Nat succ t (plus t zero) (rightUnit t)

where

eqf : Πk:Kind. Πk':Kind. Πf:k → k'. Πt:k. Πt':k. Eq k t t' → Eq k' (f t) (f t')
eqf = λk:Kind. λk':Kind. λf:k → k'. λt:k. λt':k. λp:Eq k t t'.
        Leibniz k t t' p (λt'':k. Eq k' (f t) (f t'')) (refl k' (f t))

The type term eqf constructs a proof of equality between the results of two
applications of a function, given a proof of equality between the arguments. In
rightUnit it is employed to obtain from the inductive hypothesis (with proof
represented by rightUnit t) a proof of Eq Nat (succ t) (succ (plus t zero)), which
by the definition of plus is βηι-equivalent to the goal
Eq Nat (succ t) (plus (succ t) zero). The dependency between the parameter of
rightUnit and the types of the right-hand side branches must be specified using
λt:Nat. Eq Nat t (plus t zero) as the second parameter of the Elim term in the
unsugared TL definition of rightUnit; the type of the zero branch is
βηι-equivalent to Eq Nat zero (plus zero zero), and that of the succ branch with
parameter t is Eq Nat (succ t) (plus (succ t) zero).

Returning to type conversions in λ_H, suppose now that we have a vector of
length plus t1 t2, while a function we want to apply to it expects a vector of
length plus t2 t1. Let us define the proof-augmented version of the vector type
as follows.

vecp : Nat → Ω → Ω
vecp = λt':Nat. λt1:Ω. ∃t:Nat. ∃P:Eq Nat t' t. vec t t1

The “old” vectors can be trivially converted to the new type by giving them the
same size they had: If v1 has type vec A B, then

⟨t = A, ⟨P = refl Nat A, v1 : vec A B⟩
        : ∃P:Eq Nat A t. vec t B⟩

has type vecp A B. Selection from these vectors can be performed for the same
index expressions as for the corresponding “old” vectors: constructing a proof
of LT A' t from proofs of LT A' A and Eq Nat A t is straightforward. Conversion
of the type of some term v from vecp (plus t1 t2) nat to vecp (plus t2 t1) nat is
performed by the expression

open v as ⟨t, v'⟩ in open v' as ⟨P, v''⟩ in
  ⟨t = t,
   ⟨P = eqTrans Nat (plus t2 t1) (plus t1 t2) t (plusSym t2 t1) P,
    v'' : vec t nat⟩
   : ∃P:Eq Nat (plus t2 t1) t. vec t nat⟩

where eqTrans is a proof of the transitivity of equality

eqTrans : Πk:Kind. Πt:k. Πt':k. Πt'':k. Eq k t t' → Eq k t' t'' → Eq k t t''
eqTrans = λk:Kind. λt:k. λt':k. λt'':k. λp:Eq k t t'.
            λp':Eq k t' t''. Leibniz k t' t'' p' (Eq k t) p

and plusSym is a proof of the symmetry of plus (using the lemma succPlus proving
that ∀n, m ∈ ℕ. (n + m) + 1 = n + (m + 1)):

plusSym : Πt:Nat. Πt':Nat. Eq Nat (plus t t') (plus t' t)
plusSym zero = rightUnit
plusSym (succ t) = λt':Nat. eqTrans Nat
                     (plus (succ t) t')
                     (succ (plus t' t))
                     (plus t' (succ t))
                     (eqf Nat Nat succ (plus t t') (plus t' t) (plusSym t t'))
                     (succPlus t' t)

succPlus : Πt:Nat. Πt':Nat. Eq Nat (succ (plus t t')) (plus t (succ t'))
succPlus zero = λt':Nat. refl Nat (succ t')
succPlus (succ t) = λt':Nat. eqf Nat Nat succ
                      (succ (plus t t'))
                      (plus t (succ t'))
                      (succPlus t t')



Similar proof terms can be found, among many others, in standard proof
libraries (e.g., that of Coq [Huet et al. 2000]).
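
As an illustration of how such proof terms look in a proof assistant, the chain
leftUnit, rightUnit, succPlus, plusSym can be replayed in Lean 4. This is a
sketch under assumptions: plus is redefined here by recursion on its first
argument (Lean's own addition recurses on the second), Lean's built-in equality
replaces Eq, and congrArg and Eq.trans play the roles of eqf and eqTrans.

```lean
-- `plus` recurses on its first argument, like the definition the text refers to.
def plus : Nat → Nat → Nat
  | 0,     n => n
  | m + 1, n => Nat.succ (plus m n)

-- Left unit holds by computation alone.
theorem leftUnit (t : Nat) : t = plus 0 t := rfl

-- Right unit needs induction; `congrArg Nat.succ` lifts the hypothesis under succ.
theorem rightUnit : (t : Nat) → t = plus t 0
  | 0     => rfl
  | t + 1 => congrArg Nat.succ (rightUnit t)

-- (t + t') + 1 = t + (t' + 1)
theorem succPlus : (t t' : Nat) → Nat.succ (plus t t') = plus t (Nat.succ t')
  | 0,     _  => rfl
  | t + 1, t' => congrArg Nat.succ (succPlus t t')

-- Symmetry of plus, composed by transitivity exactly as in the text.
theorem plusSym : (t t' : Nat) → plus t t' = plus t' t
  | 0,     t' => rightUnit t'
  | t + 1, t' => (congrArg Nat.succ (plusSym t t')).trans (succPlus t' t)
```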

Due to the explicit use of proof terms, this technique for support of type
conversions can also exploit equivalences which are valid only locally, for
instance in a branch of a term-level conditional. To simplify the following
example, let us extend the computation language with a comparison for equality
between natural numbers with the obvious semantics.1 In the following example,
two vectors of unrelated (in general) sizes can be converted to the same type if
they are dynamically determined to have the same size.

Λt:Nat. λn:snat(t). λv:vecp t nat.
Λt':Nat. λn':snat(t'). λv':vecp t' nat.
  if [EqOrTrue t t', eqPrf t t']
     (n = n',
      P. ... open v' as ⟨t1, x⟩ in open x as ⟨P1, y⟩ in
             ⟨t2 = t1, ⟨P2 = eqTrans Nat t t' t1 P P1, y : vec t1 nat⟩
                       : ∃P2:Eq Nat t t2. vec t2 nat⟩ ...,
      _. ...)

where EqOrTrue and eqPrf are the analogues of LTOrTrue and ltPrf from Section 4.3.
The proof of Eq Nat t t2, bound to P2, is constructed by transitivity from the
proof of Eq Nat t t', bound to P by the conditional, and the proof of Eq Nat t' t2,
extracted from the package v' and bound to P1. As a result the type of the open
term, which is a repackaged v', is vecp t nat, the type of v.

Notice  that  all  terms  involved  in  the  type  conversions  have  no  computational 
overhead  and  will  be  eliminated  under  type-erasure  semantics;  we  emphasized 
this  fact  in  the  examples  by  placing  the  conversions  inline. 
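
A locally valid equality of this kind can be sketched in Lean 4: a run-time
test on two lengths yields, in the successful branch only, a proof that lets
one vector be retyped at the other's length. The names Vec and retypeAs are
hypothetical, and Fin-indexed functions are assumed as a stand-in for vec.

```lean
-- Hypothetical stand-in for `vec n nat`: a length-indexed vector of naturals.
def Vec (n : Nat) : Type := Fin n → Nat

-- Run-time test n = n'. In the "then" branch the proof `h` is in scope and is
-- used to retype v' at length n; in the "else" branch nothing is learned.
def retypeAs {n n' : Nat} (_v : Vec n) (v' : Vec n') : Option (Vec n) :=
  if h : n = n' then
    some (h ▸ v')
  else
    none
```

The rewrite h ▸ v' only changes the type of v'; it does not inspect or copy its
elements, in line with the remark that the conversions have no computational
content.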

As with the kind term LT, strictly speaking TL does not allow the above
definition of Eq, but its Church encoding has the same properties for our
purposes, since we do not need dependent or large elimination of equality proof
terms for the proof compositions shown here. The Church encoding of the equality
kind, its “constructor,” and its elimination are as follows.

Eq = λk:Kind. λt:k. λt':k. ΠP:k → Kind. P t → P t'
refl = λk:Kind. λt:k. λP:k → Kind. λp:P t. p
Leibniz = λk:Kind. λt:k. λt':k. λp:Eq k t t'. p
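
The Church encoding of equality, together with eqTrans and eqf as used in this
section, can be checked in Lean 4. The sketch assumes that Lean's Prop plays
the role of Kind in the quantification over predicates P; the names CEq, crefl
and leibniz are ours.

```lean
universe u

-- Church-encoded (Leibniz) equality: t' satisfies every predicate that t does.
def CEq (k : Sort u) (t t' : k) : Prop := ∀ P : k → Prop, P t → P t'

theorem crefl (k : Sort u) (t : k) : CEq k t t :=
  fun _ p => p

-- Elimination is just application of the proof to the predicate.
theorem leibniz (k : Sort u) (t t' : k) (p : CEq k t t') (P : k → Prop) :
    P t → P t' :=
  p P

-- Transitivity: transport p along p' using the predicate `CEq k t`.
theorem eqTrans (k : Sort u) (t t' t'' : k)
    (p : CEq k t t') (p' : CEq k t' t'') : CEq k t t'' :=
  leibniz k t' t'' p' (CEq k t) p

-- Congruence: equal arguments give equal results.
theorem eqf (k k' : Sort u) (f : k → k') (t t' : k)
    (p : CEq k t t') : CEq k' (f t) (f t') :=
  leibniz k t t' p (fun t'' => CEq k' (f t) (f t'')) (crefl k' (f t))
```

None of these definitions performs dependent or large elimination on an
equality proof, which is why the encoding suffices here.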


1 Comparison for equality can be derived from the less-than comparison of λ_H;
we will also need a straightforward-to-define proof term for
Πt:Nat. Πt':Nat. Not (LT t t') → Not (LT t' t) → Eq Nat t t' or equivalent.

Clearly there are opportunities to generalize this style to weaker relations of
equivalence, which reveal partial information about the hidden type
parameters. We will not explore this topic here.

4.5  Type  safety 

The type safety of λ_H is a corollary of its properties of progress and subject
reduction. A pivotal element in proving progress (Lemma 4.3) is the connection
between the existence of a proof (type) term of kind LT m̂ n̂, provided by rule
E-SEL, and the existence of a (metalogical) proof of the side condition m < n,
required by rule R-SEL. Similarly, subject reduction (Lemma 4.5) in the cases
of R-ADD and R-LT-T/F relies on the adequate representation of addition and
comparison by plus and lt.

Lemma 4.1 (Adequacy of the TL representation of arithmetic)

(1) For all m, n ∈ ℕ, plus m̂ n̂ =βηι k̂, where k = m+n.

(2) For all m, n ∈ ℕ, lt m̂ n̂ =βηι true if and only if m < n.

(3) For all m, n ∈ ℕ, m < n if and only if there exists a type A such that
· ⊢ A : LT m̂ n̂.

Proof sketch

1: By induction on m and inspection of the definition of plus.

2: By induction on m and the definition of le (Figure 2); for the forward
direction the auxiliary inductive hypothesis is that for all n, if le m̂ n̂, then
m < n.

3: For the forward direction it suffices to observe that the structure of the
metalogical proof of m < n (in terms of the above axioms of ordering) can
be directly reflected in a type term of kind LT m̂ n̂. The inverse direction
is shown by examining the structure of closed type terms of this kind in
normal form. □

We also need a guarantee that the equivalence of constructor applications
implies the equivalence of the constructors and their arguments.

Lemma 4.2 If Ctor (i, I) A =βηι Ctor (i', I') A', then i = i', I =βηι I', and
A =βηι A'.

Proof sketch A corollary of the confluence of TL (Theorem 3.3). □

Lemma 4.3 (Progress) If ·;· ⊢ e : A, then either e is a value, or there exists
e' such that e ↦ e'.

Proof sketch By standard techniques [Wright and Felleisen 1994] using
induction on the typing derivation for e. Due to the transitivity of =βηι any
derivation of Δ; Γ ⊢ e : A can be converted to a standard form in which there
is an application of rule E-CONV at its root, whose first premise ends with an
instance of a rule other than E-CONV, all of whose term derivation premises
are in standard form.

The interesting case is that of the dependently typed sel construct.

If e = sel[A'](v, v'), by inspection of the typing rules the derivation of
·;· ⊢ e : A in standard form must have an instance of rule E-SEL in the premise
of its root. Hence the subderivation for v must assign to it a tuple type, and
the whole derivation has the form

         𝒟                        𝒟′                    ℰ
·;· ⊢ v : tup A2 A''    ·;· ⊢ v' : snat A1    · ⊢ A' : LT A1 A2
-----------------------------------------------------------------
              ·;· ⊢ sel[A'](v, v') : A'' A1
-----------------------------------------------------------------
              ·;· ⊢ sel[A'](v, v') : A

where A =βηι A'' A1. By inspection of the typing rules, rules other than
E-CONV assign to all values types which are applications of constructors of Ω.
Since the derivation 𝒟 is in standard form, it ends with an E-CONV, in the
premise of which another rule assigns v a type βηι-equivalent to tup A2 A''.
Then by Lemma 4.2 this type must be an application of tup, and again by
inspection the only rule which applies is E-TUP, which implies
v = ⟨v0, ... v_{n-1}⟩, and the derivation 𝒟 must have the form


        ∀i < n     𝒟_i
          ·;· ⊢ v_i : A''' î
-----------------------------------------
·;· ⊢ ⟨v0, ... v_{n-1}⟩ : tup n̂ A'''

Also by Lemma 4.2 A2 =βηι n̂. Similarly the only rule assigning to a value
a type convertible to that in the conclusion of 𝒟′ is E-NAT, hence A1 =βηι m̂
for some m ∈ ℕ, and v' = m. Then, by adequacy of LT (Lemma 4.1(3)), the
conclusion of ℰ implies that m < n. Hence by rule R-SEL e ↦ v_m.

The other cases are straightforward; as a representative, consider e = e1 e2.
If e1 is not a value, then by inductive hypothesis e1 ↦ e1', therefore
e1 = E1{e11} and e1' = E1{e11'} for some evaluation context E1 and redex e11 such
that e11 ↝ e11'; then e ↦ E{e11'}, where E = E1 e2. The subcase when e1 is a
value, but e2 is not, is similar. If both e1 and e2 are values, then the typing
derivation for e ends with an instance of rule E-CONV applied to a derivation
with an instance of E-APP at its root, where a derivation for e1 is in the
premise for the subterm with an arrow type. Reasoning as in the case for sel
above, since e1 is a value and only rules E-FUN and E-FIX (again excluding
E-CONV due to the standard form of the derivation) assign an arrow type to a
value, we have that e1 must be either an abstraction or a fixpoint (of an arrow
type). Then e reduces by rule R-β or R-FIX, respectively, with the empty
evaluation context. □

A  standard  type  substitution  lemma  is  used  in  the  proof  of  Subject  Reduction 
for  the  cases  of  redexes  with  type-level  parameters. 


Lemma 4.4 (Type substitution) If Δ, X:B; Γ ⊢ e : A' and Δ ⊢ A : B, then
Δ; [A/X]Γ ⊢ [A/X]e : [A/X]A'.

Proof sketch By induction on the typing derivation for e. □

Lemma 4.5 (Subject Reduction) If ·;· ⊢ e : A and e ↦ e', then ·;· ⊢ e' : A.

Proof sketch Since evaluation contexts bind no variables, it suffices to prove
subject reduction for the reduction of redexes (↝) and use a standard term
substitution lemma. We show only some cases of redexes involving sel and if.


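- The derivation for e = sel[A'](⟨v0, ... v_{n-1}⟩, m) in standard form has the
shape

      ∀i < n     𝒟_i
        ·;· ⊢ v_i : A''' î
  ---------------------------
  ·;· ⊢ ⟨v⟩ : tup n̂ A'''          ·;· ⊢ m : snat m̂                ℰ
  ---------------------------    --------------------
  ·;· ⊢ ⟨v⟩ : tup A2 A''          ·;· ⊢ m : snat A1      · ⊢ A' : LT A1 A2
  --------------------------------------------------------------------------
               ·;· ⊢ sel[A'](⟨v0, ... v_{n-1}⟩, m) : A'' A1
  --------------------------------------------------------------------------
               ·;· ⊢ sel[A'](⟨v0, ... v_{n-1}⟩, m) : A

where A =βηι A'' A1, A''' =βηι A'', and A1 =βηι m̂. Since e ↝ e' only by rule
R-SEL, we have m < n and e' = v_m, so from 𝒟_m and A''' m̂ =βηι A'' m̂ =βηι
A'' A1 =βηι A we obtain a derivation of ·;· ⊢ e' : A.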

- In the case of if the standard derivation 𝒟 of

·;· ⊢ if [B, A'](tt, X1. e1, X2. e2) : A

ends with an instance of E-CONV, preceded by an instance of E-IF. Using the
notation from Figure 5, from the premises of this rule it follows that we have
a derivation ℰ of · ⊢ A' : B A'', and A'' =βηι true (since rule E-TRUE assigns
sbool true to tt), hence we have · ⊢ A' : B true by CONV. By Lemma 4.4 from
ℰ and the derivation of X1:B true; · ⊢ e1 : A (provided as another premise),
since X1 is not free in A (ensured by the premise · ⊢ A : Ω) we obtain a
derivation of ·;· ⊢ [A'/X1]e1 : A. □


Theorem 4.6 (Safety of λ_H) If ·;· ⊢ e : A, then either e ↦* v and ·;· ⊢ v : A,
or e diverges (i.e., for each e', if e ↦* e', then there exists e'' such that
e' ↦ e'').

Proof sketch Follows from Lemmas 4.3 and 4.5. □

4.6  Discussion 

The proof of Progress of λ_H relies critically on the adequacy of the
representation of meta-proofs of natural numbers being in the less-than
relation, that is, that for closed A and B the kind LT A B is inhabited if and
only if A and B represent natural numbers related by less-than. In the case of
the less-than relation and LT this fact was proved in Lemma 4.1. However, it
must be kept in mind when considering extensions of λ_H that since CIC and TL
are more expressive than higher-order predicate logic, adequacy of the
representations of meta-proofs does not hold in general, hence the existence of
a term of the kind of the proposition does not imply that there is a meta-proof
of the proposition. For instance the ability to eliminate inductive kinds in TL
allows analysis of proof derivations, a technique which allows the construction
of proof terms without counterpart in standard meta-reasoning. This issue does
not arise for first-order proof representations (whose constructors have no
parameters of a function kind) such as LT, and we do not expect it to be a
concern in practice.

In cases when it does arise, it could be resolved by using the underlying
consistent logic of CIC in place of the meta-logic; for instance in our
presentation the question of adequacy is raised because the operational
semantics of λ_H is defined in meta-logical terms, but this question would be
moot if λ_H and its semantics were defined as CIC terms. To eliminate the
interaction with the meta-logic, this approach should be applied all the way
down to the hardware specification (as done in some PCC systems [Appel and
Felty 2000]); we plan to pursue this in the future.

The language λ_H is intended only as an illustration of the expressiveness
of type systems based on TL. As we showed in Section 4.4, type conversions
can be programmed in λ_H; however, it is also easy to extend λ_H with a type
conversion construct cast, which allows conversion between any types which
the programmer can prove are in a given relation of equivalence. The strongest
such equivalence relation in TL is represented by Eq, and in this case the
typing rule for cast is


Δ; Γ ⊢ e : A      Δ ⊢ B : Eq Ω A A'
------------------------------------  (E-CAST)
Δ; Γ ⊢ cast[A, A', B] e : A'

The dynamic semantics of cast is trivial:

cast[A, A', B] e ↝ e    (R-CAST)


The proof of the soundness of this extension is based on the observation
(following from Theorem 3.3, the Church-Rosser property of TL) that if the
judgment · ⊢ B : Eq Ω A A' is derivable (which is what we have in the
corresponding case of the proof of Subject Reduction), then the normal form B'
of B is an application of refl to some kind equivalent to Ω and to some type
A1. But the kind of this application is then Eq Ω A1 A1, while the kind of B'
is Eq Ω A A', so either A = A1, or there is an application of rule CONV in the
derivation for B', with a proof of A =βηι A1 in the premise, and similarly for
A' vs. A1. Thus we can obtain a proof that A =βηι A', and the rest of the
meta-proof is the same as for E-CONV.2

In a language equipped with this construct, the programmer provides the
compiler with proofs of correctness of type conversions, which legalizes more
conversions than in any decidable type system with a built-in notion of
conversion. Reusing definitions from Section 4.4, the cast from snat(plus t t')
to snat(plus t' t) is

cast[snat (plus t t'),
     snat (plus t' t),
     eqf Nat Ω snat (plus t t') (plus t' t) (plusSym t t')]
  e
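
Lean 4 offers a primitive with the same flavor, which makes for a compact
illustration. In the sketch below, Lean's cast and congrArg are used in place
of the cast construct and eqf, the library lemma Nat.add_comm replaces plusSym,
and the length-indexed type Vec and the function castLen are hypothetical names
introduced for the example.

```lean
-- Hypothetical length-indexed vector type, standing in for a type indexed by Nat.
def Vec (n : Nat) : Type := Fin n → Nat

-- Convert between two types that are provably, but not definitionally, equal.
-- `congrArg Vec` lifts the equality on lengths to an equality on types,
-- and `cast` transports the value along that proof.
def castLen (t t' : Nat) (v : Vec (t + t')) : Vec (t' + t) :=
  cast (congrArg Vec (Nat.add_comm t t')) v
```

As with E-CAST, the conversion is accepted only because a proof of the
equality is supplied explicitly by the programmer.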


2 Again, this proof of soundness goes through with either an inductive
definition of Eq, as in CIC, or with its Church encoding, since no large or
dependent elimination of proof terms is used.
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5.  CPS  CONVERSION 

In this section we show how to perform CPS conversion on $\lambda_H$ while still preserving proofs represented in the type system. This stage transforms all unconditional control transfers, including function invocation and return, to function calls and gives explicit names to all intermediate computations. In this way, evaluation order is explicit and there is no need for a control stack.

There are two interesting points in our approach to CPS conversion. First, as we discuss in detail later in this section, arbitrary terms of the type language that appear in computation terms are not transformed. Second, the transformation of types is encoded as a function in our type language and, as will become apparent later in this section, this fact is important for proving that our CPS conversion is type-correct.

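We start by defining a version of $\lambda_H$ using type-annotated terms. By $\bar{f}$ and $\bar{e}$ we denote the terms without annotations. Type annotations allow us to present the CPS transformation based on syntactic instead of typing derivations.

\[
\begin{array}{llcl}
(\mathit{exp}) & e & ::= & \bar{e}^{A} \\
 & \bar{e} & ::= & x \mid n \mid \mathsf{tt} \mid \mathsf{ff} \mid f \mid \mathsf{fix}\ x{:}A.\,f \mid e\ e' \mid e[A] \mid \langle X = A,\ e{:}A' \rangle \\
 & & \mid & \mathsf{open}\ e\ \mathsf{as}\ \langle X, x \rangle\ \mathsf{in}\ e' \mid \langle e_0, \ldots, e_{n-1} \rangle \mid \mathsf{sel}[A](e, e') \\
 & & \mid & e\ \mathit{aop}\ e' \mid e\ \mathit{cop}\ e' \mid \mathsf{if}[A, A'](e,\ X_1.e_1,\ X_2.e_2) \\
(\mathit{fun}) & f & ::= & \bar{f}^{A} \\
 & \bar{f} & ::= & \lambda x{:}A.\,e \mid \Lambda X{:}A.\,f
\end{array}
\]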

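We call the target calculus for this phase $\lambda_K$, with syntax:

\[
\begin{array}{llcl}
(\mathit{val}) & v & ::= & x \mid n \mid \mathsf{tt} \mid \mathsf{ff} \mid \langle X = A,\ v{:}A' \rangle \mid \langle v_0, \ldots, v_{n-1} \rangle \\
 & & \mid & \mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e \\
(\mathit{exp}) & e & ::= & v[A_1, \ldots, A_n](v') \mid \mathsf{let}\ x = v\ \mathsf{in}\ e \mid \mathsf{let}\ \langle X, x \rangle = \mathsf{open}\ v\ \mathsf{in}\ e \\
 & & \mid & \mathsf{let}\ x = \mathsf{sel}[A](v, v')\ \mathsf{in}\ e \mid \mathsf{let}\ x = v\ \mathit{aop}\ v'\ \mathsf{in}\ e \mid \mathsf{let}\ x = v\ \mathit{cop}\ v'\ \mathsf{in}\ e \\
 & & \mid & \mathsf{if}[A, A'](v,\ X_1.e_1,\ X_2.e_2)
\end{array}
\]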

Expressions in $\lambda_K$ consist of a series of let bindings followed by a function application or a conditional branch. There is only one abstraction mechanism, fix, which combines type and value abstraction. Multiple arguments may be passed by packing them in a tuple. We use the following syntactic sugar to denote non-recursive function definitions and value applications in $\lambda_K$ (here $x'$ is a fresh variable):

\[
\begin{array}{rcl}
\lambda x{:}A.\,e & = & \mathsf{fix}\ x'[\,](x{:}A).\,e \\
v\ v' & = & v[\,](v') \\
\Lambda X_1{:}A_1. \ldots \Lambda X_n{:}A_n.\ \lambda x{:}A.\,e & = & \mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e
\end{array}
\]

$\lambda_K$ shares the TL type language with $\lambda_H$. The types for $\lambda_K$ all have kind $\Omega_K$ which, as in $\lambda_H$, is an inductive kind defined in TL. The $\Omega_K$ kind has all the constructors of $\Omega$ plus one more (func). Since functions in CPS do not return values, the function type constructor of $\Omega_K$ has a different kind:

\[ \to\!\bot \;:\; \Omega_K \to \Omega_K \]

We use the more conventional syntax $A \to \bot$ for $\to\!\bot\ A$ (i.e., the type of functions taking a parameter of type $A$). As will become apparent shortly in the static semantics of $\lambda_K$, no value of $\lambda_K$ has type $A \to \bot$. The latter is used in conjunction with the new constructor func to form the types of function values:

\[ \mathsf{func} \;:\; \Omega_K \to \Omega_K \]

Every function value is implicitly associated with a closure environment (for all the free variables), so the func constructor is useful in the closure-conversion phase (see Section 6). In the case of function values, the type parameter of func is an element of $\Omega_K$ constructed by application of $\to\!\bot$, $\forall_{\mathsf{Kind}}$, or $\forall_{\mathsf{Kscm}}$. The func constructor allows us to build one closure for each polymorphic function definition (even though it contains both type abstraction and term abstraction).

In the static semantics of $\lambda_K$ we use two forms of judgments. As in $\lambda_H$, the judgment $\Delta; \Gamma \vdash_K v : A$ indicates that the value $v$ is well formed and of type $A$ in the type and value contexts $\Delta$ and $\Gamma$ respectively. Moreover, $\Delta; \Gamma \vdash_K e$ indicates that the expression $e$ is well formed in $\Delta$ and $\Gamma$. In both forms of judgments, we omit the subscript from $\vdash_K$ when it can be deduced from the context.

The static semantics of $\lambda_K$ is specified by the formation rules in Figure 6. We omit the rules for environment formation, variables, constants, tuples, packages, and type conversion on values, which are the same as in $\lambda_H$, and we give only one example each for arithmetic and comparison operators. Except for the rules K-FIX and K-APP, which must take into account the presence of func, the static semantics for $\lambda_K$ is a natural consequence of the static semantics for $\lambda_H$.

Typed CPS conversion involves the translation of both types and computation terms. Earlier algorithms [Harper and Lillibridge 1993; Morrisett et al. 1998] require traversing and transforming every term in the type language (which would include all the proofs in our setting). This is impractical because proofs are large, and transforming them can alter their meanings and break the sharing among different intermediate languages.

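To see the actual problem, let us convert the $\lambda_H$ expression $\langle X = A,\ e : B \rangle$ to CPS, assuming that it has type $\exists X{:}A'.\,B$. We use $\mathcal{K}_{typ}$ to denote the meta-level translation function for the type language and $\mathcal{K}_{exp}$ for the computation language. Under previous algorithms, the translation also transforms the witness $A$:

\[ \mathcal{K}_{exp}[\![\langle X = A,\ e{:}B \rangle]\!] = \lambda k{:}\mathcal{K}_{typ}[\![\exists X{:}A'.\,B]\!].\ \mathcal{K}_{exp}[\![e]\!]\ (\lambda x{:}\mathcal{K}_{typ}[\![[A/X]B]\!].\ k\ \langle X = \mathcal{K}_{typ}[\![A]\!],\ x{:}\mathcal{K}_{typ}[\![B]\!] \rangle) \]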

Here we CPS-convert $e$ and apply it to a continuation, which puts the result of its evaluation in a package and hands it to the return continuation $k$. With proper definition of $\mathcal{K}_{typ}$ and assuming that $\mathcal{K}_{typ}[\![X]\!] = X$ on all variables $X$, we can show that the two types $\mathcal{K}_{typ}[\![[A/X]B]\!]$ and $[\mathcal{K}_{typ}[\![A]\!]/X](\mathcal{K}_{typ}[\![B]\!])$ are equivalent (under $=_{\beta\eta\iota}$). Thus the translation preserves typing.

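\[ \frac{\text{for all } i \in \{1 \ldots n\}\quad \Delta \vdash A_i : s_i \qquad \Delta, X_1{:}A_1, \ldots, X_n{:}A_n \vdash A : \Omega_K \qquad \Delta, X_1{:}A_1, \ldots, X_n{:}A_n;\ \Gamma, x'{:}A', x{:}A \vdash e}{\Delta; \Gamma \vdash \mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e : A'} \quad \text{(K-FIX)} \]

where $A' = \mathsf{func}\,(\forall_{s_1} X_1{:}A_1. \ldots \forall_{s_n} X_n{:}A_n.\ A \to \bot)$

\[ \frac{\text{for all } i \in \{1 \ldots n\}\quad \Delta \vdash A_i : B_i \qquad \Delta; \Gamma \vdash v' : \mathsf{func}\,(\forall_{s_1} X_1{:}B_1. \ldots \forall_{s_n} X_n{:}B_n.\ A \to \bot) \qquad \Delta; \Gamma \vdash v : [A_1/X_1] \ldots [A_n/X_n]A}{\Delta; \Gamma \vdash v'[A_1, \ldots, A_n](v)} \quad \text{(K-APP)} \]

\[ \frac{\Delta; \Gamma \vdash v : A \qquad \Delta; \Gamma, x{:}A \vdash e}{\Delta; \Gamma \vdash \mathsf{let}\ x = v\ \mathsf{in}\ e} \quad \text{(K-VAL)} \]

\[ \frac{\Delta; \Gamma \vdash v : \mathsf{tup}\ A''\ B \qquad \Delta; \Gamma \vdash v' : \mathsf{snat}\ A' \qquad \Delta \vdash A : \mathsf{LT}\ A'\ A'' \qquad \Delta; \Gamma, x{:}B\ A' \vdash e}{\Delta; \Gamma \vdash \mathsf{let}\ x = \mathsf{sel}[A](v, v')\ \mathsf{in}\ e} \quad \text{(K-SEL)} \]

\[ \frac{\Delta; \Gamma \vdash v : \exists_s Y{:}B.\,A \qquad \Delta, X{:}B;\ \Gamma, x{:}[X/Y]A \vdash e}{\Delta; \Gamma \vdash \mathsf{let}\ \langle X, x \rangle = \mathsf{open}\ v\ \mathsf{in}\ e} \quad \left(\begin{array}{l} X \notin \Delta \\ s \neq \mathsf{Ext} \end{array}\right) \quad \text{(K-OPEN)} \]

\[ \frac{\Delta; \Gamma \vdash v : \mathsf{snat}\ A \qquad \Delta; \Gamma \vdash v' : \mathsf{snat}\ A' \qquad \Delta; \Gamma, x{:}\mathsf{snat}\,(\mathsf{plus}\ A\ A') \vdash e}{\Delta; \Gamma \vdash \mathsf{let}\ x = v + v'\ \mathsf{in}\ e} \quad \text{(K-ADD)} \]

\[ \frac{\Delta; \Gamma \vdash v : \mathsf{snat}\ A \qquad \Delta; \Gamma \vdash v' : \mathsf{snat}\ A' \qquad \Delta; \Gamma, x{:}\mathsf{sbool}\,(\mathsf{lt}\ A\ A') \vdash e}{\Delta; \Gamma \vdash \mathsf{let}\ x = v < v'\ \mathsf{in}\ e} \quad \text{(K-LT)} \]

\[ \frac{\Delta \vdash B : \mathsf{Bool} \to \mathsf{Kind} \qquad \Delta \vdash A : B\ A' \qquad \Delta; \Gamma \vdash v : \mathsf{sbool}\ A' \qquad \Delta, X_1{:}B\ \mathsf{true};\ \Gamma \vdash e_1 \qquad \Delta, X_2{:}B\ \mathsf{false};\ \Gamma \vdash e_2}{\Delta; \Gamma \vdash \mathsf{if}[B, A](v,\ X_1.e_1,\ X_2.e_2)} \quad \text{(K-IF)} \]

Fig. 6. Static semantics of $\lambda_K$.

But we do not want to touch the witness $A$, so the translation function should be defined as follows:

\[ \mathcal{K}_{exp}[\![\langle X = A,\ e{:}B \rangle]\!] = \lambda k{:}\mathcal{K}_{typ}[\![\exists X{:}A'.\,B]\!].\ \mathcal{K}_{exp}[\![e]\!]\ (\lambda x{:}\mathcal{K}_{typ}[\![[A/X]B]\!].\ k\ \langle X = A,\ x{:}\mathcal{K}_{typ}[\![B]\!] \rangle) \]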

To preserve typing, we have to make sure that the two types $\mathcal{K}_{typ}[\![[A/X]B]\!]$ and $[A/X](\mathcal{K}_{typ}[\![B]\!])$ are equivalent. This seems impossible to achieve if $\mathcal{K}_{typ}$ is defined at the meta level.

Our solution is to internalize the definition of $\mathcal{K}_{typ}$ in our type language. We replace $\mathcal{K}_{typ}$ by a type function $\mathsf{K}$ of kind $\Omega \to \Omega_K$. For readability, we use the pattern-matching syntax, but it can be easily coded using the Elim construct.


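\[
\begin{array}{lcl}
\mathsf{K}\,(\mathsf{snat}\ t) & = & \mathsf{snat}\ t \\
\mathsf{K}\,(\mathsf{sbool}\ t) & = & \mathsf{sbool}\ t \\
\mathsf{K}\,(t_1 \to t_2) & = & \mathsf{func}\,((\mathsf{K}(t_1) \times \mathsf{K_c}(t_2)) \to \bot) \\
\mathsf{K}\,(\mathsf{tup}\ t_1\ t_2) & = & \mathsf{tup}\ t_1\ (\lambda t{:}\mathsf{Nat}.\ \mathsf{K}(t_2\ t)) \\
\mathsf{K}\,(\forall_{\mathsf{Kind}}\ k\ t) & = & \mathsf{func}\,(\forall_{\mathsf{Kind}}\ k\ (\lambda t_1{:}k.\ \mathsf{K_c}(t\ t_1) \to \bot)) \\
\mathsf{K}\,(\exists_{\mathsf{Kind}}\ k\ t) & = & \exists_{\mathsf{Kind}}\ k\ (\lambda t_1{:}k.\ \mathsf{K}(t\ t_1)) \\
\mathsf{K}\,(\forall_{\mathsf{Kscm}}\ z\ t) & = & \mathsf{func}\,(\forall_{\mathsf{Kscm}}\ z\ (\lambda k{:}z.\ \mathsf{K_c}(t\ k) \to \bot)) \\
\mathsf{K}\,(\exists_{\mathsf{Kscm}}\ z\ t) & = & \exists_{\mathsf{Kscm}}\ z\ (\lambda k{:}z.\ \mathsf{K}(t\ k))
\end{array}
\]

where

\[ \mathsf{K_c} = \lambda t{:}\Omega.\ \mathsf{func}\,(\mathsf{K}(t) \to \bot). \]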

The definition of $\mathsf{K}$ is in the spirit of the interp function of Crary and Weirich [1999]. However, interp cannot be used in defining a similar CPS conversion, because its domain does not cover (nor is there an injection to it from) all types appearing in type annotations. In $\lambda_H$ these types are in the inductive kind $\Omega$ and can be analyzed by $\mathsf{K}$. We can now prove $\mathsf{K}([A/X]B) =_{\beta\eta\iota} [A/X](\mathsf{K}(B))$ by first reducing $B$ to its normal form $B'$. Clearly, $\mathsf{K}([A/X]B) =_{\beta\eta\iota} \mathsf{K}([A/X]B')$ and $[A/X](\mathsf{K}(B')) =_{\beta\eta\iota} [A/X](\mathsf{K}(B))$. Finally, we can show the equivalence $\mathsf{K}([A/X]B') =_{\beta\eta\iota} [A/X](\mathsf{K}(B'))$ by induction over the structure of the normal form $B'$.
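The role of an internalized K can be illustrated with a small sketch. The Python below is not the paper's TL encoding: it models only the first-order fragment (snat, sbool, arrow types, and type variables) as nested tuples, treats the indices of snat and sbool as atomic, and uses hypothetical tags such as `"to_bot"` and `"pair"` for $A \to \bot$ and $A \times B$. The point it tries to show is that `K` applied to a variable stays stuck until substitution instantiates the variable, which is why substitution and `K` commute.

```python
# Types are nested tuples (illustrative encoding, not the paper's TL syntax):
#   source kind Omega:   ("snat", n) ("sbool", b) ("arrow", t1, t2) ("var", "X")
#   target kind Omega_K: ("func", t) ("to_bot", t) ("pair", t1, t2) ("K", var)

def K(t):
    """Type-level CPS translation; stuck on variables, like a neutral term."""
    tag = t[0]
    if tag in ("snat", "sbool"):
        return t
    if tag == "arrow":
        _, dom, cod = t
        return ("func", ("to_bot", ("pair", K(dom), Kc(cod))))
    if tag == "var":
        return ("K", t)  # K X cannot reduce until X is instantiated
    raise ValueError(f"not a source type: {t!r}")

def Kc(t):
    """Type of a continuation that receives a value of type K(t)."""
    return ("func", ("to_bot", K(t)))

def subst(t, x, a):
    """[a/x]t, re-running K wherever a stuck application becomes reducible."""
    tag = t[0]
    if tag == "var":
        return a if t[1] == x else t
    if tag == "K":
        return K(subst(t[1], x, a))
    if tag in ("snat", "sbool"):
        return t  # indices are atomic in this sketch
    return (tag,) + tuple(subst(c, x, a) for c in t[1:])

X = ("var", "X")
B = ("arrow", X, ("snat", 3))
A = ("arrow", ("sbool", True), ("snat", 0))

lhs = K(subst(B, "X", A))  # K([A/X]B)
rhs = subst(K(B), "X", A)  # [A/X](K(B))
```

In this encoding `lhs` and `rhs` are the same tuple. A meta-level translation would have no way to represent the stuck `("K", X)` node and so could not leave the witness untouched.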

The definition of the CPS transformation for computation terms of $\lambda_H$ to computation terms of $\lambda_K$ is given in Figure 7. As an example of how CPS conversion works, let us consider the transformation of function abstraction ($\lambda x{:}A.\,e$). The result is a function value that takes as a parameter a pair $x_{arg}$, consisting of the original abstraction's parameter $x$ and the current continuation $k$. After accessing the two elements of this pair, the function value applies the CPS conversion of the abstraction's body to $k$. On the other hand, the transformation of a function application ($e_1\ e_2$) gives a function value that takes as a parameter the current continuation $k$. By applying the CPS conversions of $e_1$ and $e_2$ to appropriate continuations, this function value ultimately applies the function corresponding to $e_1$ to a pair consisting of the value corresponding to $e_2$ and the continuation $k$.
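The two cases just described can be sketched on an untyped toy language. The Python below is an illustration under simplifying assumptions, not the typed translation of Figure 7: it drops all type annotations and proofs, the AST tags (`"fn"`, `"call"`, `"let_sel"`, `"let_add"`, `"halt"`) are hypothetical names, and fresh variables are generated with a counter. Abstractions become functions over a pair (argument, continuation), and applications pass such a pair to the converted function.

```python
import itertools

_fresh = itertools.count()

def fresh(prefix):
    return f"_{prefix}{next(_fresh)}"

def cps(e):
    """Return a target value: a function that expects the continuation k."""
    tag, k = e[0], fresh("k")
    if tag in ("var", "const"):
        return ("fn", k, ("call", ("var", k), e))
    if tag == "lam":
        _, x, body = e
        xarg, k2 = fresh("arg"), fresh("k")
        fval = ("fn", xarg,
                ("let_sel", x, ("var", xarg), 0,
                 ("let_sel", k2, ("var", xarg), 1,
                  ("call", cps(body), ("var", k2)))))
        return ("fn", k, ("call", ("var", k), fval))
    x1, x2 = fresh("x"), fresh("x")
    _, e1, e2 = e
    if tag == "app":
        last = ("call", ("var", x1), ("pair", ("var", x2), ("var", k)))
    elif tag == "add":
        r = fresh("r")
        last = ("let_add", r, ("var", x1), ("var", x2),
                ("call", ("var", k), ("var", r)))
    else:
        raise ValueError(tag)
    inner = ("fn", x2, last)
    return ("fn", k, ("call", cps(e1), ("fn", x1, ("call", cps(e2), inner))))

def value(v, env):
    tag = v[0]
    if tag == "var":
        return env[v[1]]
    if tag == "const":
        return v[1]
    if tag == "pair":
        return (value(v[1], env), value(v[2], env))
    if tag == "fn":
        return ("clo", v[1], v[2], env)
    if tag == "halt":
        return ("halt",)
    raise ValueError(tag)

def run(e, env):
    """Every call is a tail call, so a plain loop suffices: no control stack."""
    while True:
        tag = e[0]
        if tag == "call":
            f, a = value(e[1], env), value(e[2], env)
            if f[0] == "halt":
                return a
            _, param, body, cenv = f
            e, env = body, {**cenv, param: a}
        elif tag == "let_sel":
            _, x, v, i, body = e
            e, env = body, {**env, x: value(v, env)[i]}
        elif tag == "let_add":
            _, x, v1, v2, body = e
            e, env = body, {**env, x: value(v1, env) + value(v2, env)}
        else:
            raise ValueError(tag)

src = ("app", ("lam", "x", ("add", ("var", "x"), ("const", 1))), ("const", 41))
answer = run(("call", cps(src), ("halt",)), {})
```

Here `answer` is the result of evaluating `(λx. x + 1) 41` after conversion. The evaluator `run` never recurses, which mirrors the claim that CPS code needs no control stack.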

The following proposition states that our CPS conversion preserves typing. As we discussed earlier, it is important for its proof that $\mathsf{K}$ has been encoded as a function in TL.

Proposition 5.1 (Type Correctness of CPS Conversion). If $\cdot;\cdot \vdash_H e : A$, then $\cdot;\cdot \vdash_K \mathcal{K}_{exp}[\![e]\!] : \mathsf{func}\,(\mathsf{K_c}(A) \to \bot)$.

Proof sketch. By induction on the typing derivation for $e$. □

6.  CLOSURE  CONVERSION 

In this section we address the issue of how to make closures explicit for all the CPS terms in $\lambda_K$. This stage rewrites all functions so that they contain no free variables. Any variables that appear free in a function value are packaged in an environment, which together with the closed code of the function forms a closure. When a function is applied, the closed code and the environment are extracted from the closure and then the closed code is called with the environment as an additional parameter.

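\[
\begin{aligned}
\mathcal{K}_{fval}[\![(\lambda x{:}A.\,e^{B})^{A \to B}]\!] &= \lambda x_{arg}{:}\mathsf{K}(A) \times \mathsf{K_c}(B). \\
&\qquad \mathsf{let}\ x = \mathsf{sel}[\mathsf{ltPrf}\ 0\ 2](x_{arg}, 0)\ \mathsf{in}\ \mathsf{let}\ k = \mathsf{sel}[\mathsf{ltPrf}\ 1\ 2](x_{arg}, 1)\ \mathsf{in}\ \mathcal{K}_{exp}[\![e^{B}]\!]\ k \\
\mathcal{K}_{fval}[\![(\Lambda X{:}A.\,f^{B})^{\forall_s X{:}A.\,B}]\!] &= \Lambda X{:}A.\ \lambda k{:}\mathsf{K_c}(B).\ k\ (\mathcal{K}_{fval}[\![f^{B}]\!]) \\
\mathcal{K}_{exp}[\![e^{A}]\!] &= \lambda k{:}\mathsf{K_c}(A).\ k\ (e) \quad \text{for } e^{A} \text{ one of } x^{A},\ n^{\mathsf{snat}\ \hat{n}},\ \mathsf{tt}^{\mathsf{sbool}\ \mathsf{true}},\ \mathsf{ff}^{\mathsf{sbool}\ \mathsf{false}} \\
\mathcal{K}_{exp}[\![f^{A}]\!] &= \lambda k{:}\mathsf{K_c}(A).\ k\ (\mathcal{K}_{fval}[\![f^{A}]\!]) \\
\mathcal{K}_{exp}[\![(\mathsf{fix}\ x{:}A.\,f^{A})^{A}]\!] &= \lambda k{:}\mathsf{K_c}(A).\ k\ (\mathsf{fix}\ x[\,](k{:}\mathsf{K_c}(A)).\ k\ (\mathcal{K}_{fval}[\![f^{A}]\!])) \\
\mathcal{K}_{exp}[\![(e_1^{A \to B}\ e_2^{A})^{B}]\!] &= \lambda k{:}\mathsf{K_c}(B).\ \mathcal{K}_{exp}[\![e_1^{A \to B}]\!]\ (\lambda x_1{:}\mathsf{K}(A \to B).\ \mathcal{K}_{exp}[\![e_2^{A}]\!]\ (\lambda x_2{:}\mathsf{K}(A).\ x_1\ \langle x_2, k \rangle)) \\
\mathcal{K}_{exp}[\![(e^{\forall_s A'\ B}[A])^{B\ A}]\!] &= \lambda k{:}\mathsf{K_c}(B\ A).\ \mathcal{K}_{exp}[\![e^{\forall_s A'\ B}]\!]\ (\lambda x{:}\mathsf{K}(\forall_s\ A'\ B).\ x[A](k)) \\
\mathcal{K}_{exp}[\![\langle e_0^{A_0}, \ldots, e_{n-1}^{A_{n-1}} \rangle^{A}]\!] &= \lambda k{:}\mathsf{K_c}(A).\ \mathcal{K}_{exp}[\![e_0^{A_0}]\!]\ (\lambda x_0{:}\mathsf{K}(A_0).\ \ldots \\
&\qquad \mathcal{K}_{exp}[\![e_{n-1}^{A_{n-1}}]\!]\ (\lambda x_{n-1}{:}\mathsf{K}(A_{n-1}).\ k\ \langle x_0, \ldots, x_{n-1} \rangle) \ldots) \\
\mathcal{K}_{exp}[\![\mathsf{sel}[A](e_1^{\mathsf{tup}\ A''\ B}, e_2^{\mathsf{snat}\ A'})^{B\ A'}]\!] &= \lambda k{:}\mathsf{K_c}(B\ A').\ \mathcal{K}_{exp}[\![e_1^{\mathsf{tup}\ A''\ B}]\!]\ (\lambda x_1{:}\mathsf{K}(\mathsf{tup}\ A''\ B). \\
&\qquad \mathcal{K}_{exp}[\![e_2^{\mathsf{snat}\ A'}]\!]\ (\lambda x_2{:}\mathsf{K}(\mathsf{snat}\ A').\ \mathsf{let}\ x' = \mathsf{sel}[A](x_1, x_2)\ \mathsf{in}\ k\ x')) \\
\mathcal{K}_{exp}[\![\langle X = A,\ e^{[A/X]B}{:}B \rangle^{A'}]\!] &= \lambda k{:}\mathsf{K_c}(A').\ \mathcal{K}_{exp}[\![e^{[A/X]B}]\!]\ (\lambda x{:}\mathsf{K}([A/X]B).\ k\ \langle X = A,\ x{:}\mathsf{K}(B) \rangle) \\
\mathcal{K}_{exp}[\![(\mathsf{open}\ e_1^{\exists_s Y{:}A'.B}\ \mathsf{as}\ \langle X, x \rangle\ \mathsf{in}\ e_2^{A})^{A}]\!] &= \lambda k{:}\mathsf{K_c}(A).\ \mathcal{K}_{exp}[\![e_1^{\exists_s Y{:}A'.B}]\!]\ (\lambda x_1{:}\mathsf{K}(\exists_s Y{:}A'.\,B). \\
&\qquad \mathsf{let}\ \langle X, x \rangle = \mathsf{open}\ x_1\ \mathsf{in}\ \mathcal{K}_{exp}[\![e_2^{A}]\!]\ k) \\
\mathcal{K}_{exp}[\![(e_1^{\mathsf{snat}\ A} + e_2^{\mathsf{snat}\ A'})^{\mathsf{snat}\,(\mathsf{plus}\ A\ A')}]\!] &= \lambda k{:}\mathsf{K_c}(\mathsf{snat}\,(\mathsf{plus}\ A\ A')).\ \mathcal{K}_{exp}[\![e_1^{\mathsf{snat}\ A}]\!]\ (\lambda x_1{:}\mathsf{K}(\mathsf{snat}\ A). \\
&\qquad \mathcal{K}_{exp}[\![e_2^{\mathsf{snat}\ A'}]\!]\ (\lambda x_2{:}\mathsf{K}(\mathsf{snat}\ A').\ \mathsf{let}\ x' = x_1 + x_2\ \mathsf{in}\ k\ x')) \\
\mathcal{K}_{exp}[\![(e_1^{\mathsf{snat}\ A} < e_2^{\mathsf{snat}\ A'})^{\mathsf{sbool}\,(\mathsf{lt}\ A\ A')}]\!] &= \lambda k{:}\mathsf{K_c}(\mathsf{sbool}\,(\mathsf{lt}\ A\ A')).\ \mathcal{K}_{exp}[\![e_1^{\mathsf{snat}\ A}]\!]\ (\lambda x_1{:}\mathsf{K}(\mathsf{snat}\ A). \\
&\qquad \mathcal{K}_{exp}[\![e_2^{\mathsf{snat}\ A'}]\!]\ (\lambda x_2{:}\mathsf{K}(\mathsf{snat}\ A').\ \mathsf{let}\ x' = x_1 < x_2\ \mathsf{in}\ k\ x')) \\
\mathcal{K}_{exp}[\![(\mathsf{if}[B, A](e^{\mathsf{sbool}\ A''},\ X_1.e_1^{A'},\ X_2.e_2^{A'}))^{A'}]\!] &= \lambda k{:}\mathsf{K_c}(A').\ \mathcal{K}_{exp}[\![e^{\mathsf{sbool}\ A''}]\!]\ (\lambda x{:}\mathsf{K}(\mathsf{sbool}\ A''). \\
&\qquad \mathsf{if}[B, A](x,\ X_1.\mathcal{K}_{exp}[\![e_1^{A'}]\!]\ k,\ X_2.\mathcal{K}_{exp}[\![e_2^{A'}]\!]\ k))
\end{aligned}
\]

Fig. 7. CPS conversion: from $\lambda_H$ to $\lambda_K$.

Our approach to closure conversion is based on Morrisett et al. [1998], who adopt a type-erasure interpretation of polymorphism. We use the same idea for existential types. As in the case of CPS conversion, there are again two interesting points in our approach. Arbitrary terms of the type language that appear in computation terms are not transformed. Moreover, the transformation of types is again encoded as a function in our type language and this is crucial for proving that closure conversion is type-correct.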

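We call the language we use for this phase $\lambda_C$; its syntax is:

\[
\begin{array}{llcl}
(\mathit{val}) & v & ::= & x \mid n \mid \mathsf{tt} \mid \mathsf{ff} \mid \mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e \mid v[A] \\
 & & \mid & \langle v_0, \ldots, v_{n-1} \rangle \mid \langle X = A,\ v{:}A' \rangle \\
(\mathit{exp}) & e & ::= & v\ v' \mid \mathsf{let}\ x = v\ \mathsf{in}\ e \mid \mathsf{let}\ x = \mathsf{sel}[A](v, v')\ \mathsf{in}\ e \mid \mathsf{let}\ \langle X, x \rangle = \mathsf{open}\ v\ \mathsf{in}\ e \\
 & & \mid & \mathsf{let}\ x = v\ \mathit{aop}\ v'\ \mathsf{in}\ e \mid \mathsf{let}\ x = v\ \mathit{cop}\ v'\ \mathsf{in}\ e \mid \mathsf{if}[B, A](v,\ X_1.e_1,\ X_2.e_2)
\end{array}
\]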

$\lambda_C$ is similar to $\lambda_K$, the main difference being that type application and value application are again separate. Type applications are values in $\lambda_C$, reflecting the fact that they have no runtime effect in a type-erasure interpretation. We use the same kind of types $\Omega_K$ as in $\lambda_K$.

The main difference in the static semantics between $\lambda_K$ and $\lambda_C$ is that in the latter the body of a function must not contain free type or term variables. This is formalized in the rule C-FIX below. The rules C-TAPP and C-APP corresponding to the separate type and value application in $\lambda_C$ are standard.


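\[ \frac{\text{for all } i \in \{1 \ldots n\}\quad \cdot \vdash A_i : s_i \qquad \cdot, X_1{:}A_1, \ldots, X_n{:}A_n \vdash A : \Omega_K \qquad \cdot, X_1{:}A_1, \ldots, X_n{:}A_n;\ \cdot, x'{:}B, x{:}A \vdash e}{\Delta; \Gamma \vdash \mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e : B} \quad \text{(C-FIX)} \]

where $B = \forall_{s_1} X_1{:}A_1. \ldots \forall_{s_n} X_n{:}A_n.\ A \to \bot$

\[ \frac{\Delta; \Gamma \vdash v : \forall_s X{:}A'.\,B \qquad \Delta \vdash A : A'}{\Delta; \Gamma \vdash v[A] : [A/X]B} \quad \text{(C-TAPP)} \]

\[ \frac{\Delta; \Gamma \vdash v_1 : A \to \bot \qquad \Delta; \Gamma \vdash v_2 : A}{\Delta; \Gamma \vdash v_1\ v_2} \quad \text{(C-APP)} \]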


We define the transformation of types as a function $\mathsf{Cl} : \Omega_K \to \Omega_K \to \Omega_K$, the second argument of which represents the type of the closure environment. As in CPS conversion, we write $\mathsf{Cl}$ as a TL function so that the closure-conversion algorithm does not have to traverse proofs represented in the type system.


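\[
\begin{array}{lcl}
\mathsf{Cl}\,(\mathsf{snat}\ t) & = & \lambda t'{:}\Omega_K.\ \mathsf{snat}\ t \\
\mathsf{Cl}\,(\mathsf{sbool}\ t) & = & \lambda t'{:}\Omega_K.\ \mathsf{sbool}\ t \\
\mathsf{Cl}\,(t \to \bot) & = & \lambda t'{:}\Omega_K.\ (t' \times \mathsf{Cl}\,(t)\ \bot) \to \bot \\
\mathsf{Cl}\,(\mathsf{func}\ t) & = & \lambda t'{:}\Omega_K.\ \exists t_1{:}\Omega_K.\ (\mathsf{Cl}\,(t)\ t_1 \times t_1) \\
\mathsf{Cl}\,(\mathsf{tup}\ t_1\ t_2) & = & \lambda t'{:}\Omega_K.\ \mathsf{tup}\ t_1\ (\lambda t''{:}\mathsf{Nat}.\ \mathsf{Cl}\,(t_2\ t'')\ t') \\
\mathsf{Cl}\,(\forall_{\mathsf{Kind}}\ k\ t) & = & \lambda t'{:}\Omega_K.\ \forall_{\mathsf{Kind}}\ k\ (\lambda t_1{:}k.\ \mathsf{Cl}\,(t\ t_1)\ t') \\
\mathsf{Cl}\,(\exists_{\mathsf{Kind}}\ k\ t) & = & \lambda t'{:}\Omega_K.\ \exists_{\mathsf{Kind}}\ k\ (\lambda t_1{:}k.\ \mathsf{Cl}\,(t\ t_1)\ t') \\
\mathsf{Cl}\,(\forall_{\mathsf{Kscm}}\ z\ t) & = & \lambda t'{:}\Omega_K.\ \forall_{\mathsf{Kscm}}\ z\ (\lambda k{:}z.\ \mathsf{Cl}\,(t\ k)\ t') \\
\mathsf{Cl}\,(\exists_{\mathsf{Kscm}}\ z\ t) & = & \lambda t'{:}\Omega_K.\ \exists_{\mathsf{Kscm}}\ z\ (\lambda k{:}z.\ \mathsf{Cl}\,(t\ k)\ t')
\end{array}
\]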
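To see how the environment argument is threaded through Cl, the following Python sketch implements a fragment of the definition on a tuple encoding of $\Omega_K$ types. It is only an illustration: the tags (`"to_bot"`, `"func"`, `"pair"`, `"exists"`, `"bot"`) are hypothetical, binary pairs stand in for the general tup constructor, the quantifier cases are omitted, and bound type variables get generated names.

```python
import itertools

BOT = ("bot",)             # stands for the dummy environment type written ⊥
_names = itertools.count()

def Cl(t, env):
    """Closure-converted form of type t, given the environment type env."""
    tag = t[0]
    if tag in ("snat", "sbool"):
        return t                       # the environment type is ignored
    if tag == "to_bot":
        # Closed code: takes a pair (environment, converted argument).
        return ("to_bot", ("pair", env, Cl(t[1], BOT)))
    if tag == "func":
        # Closure: code paired with its environment, whose type is hidden
        # behind an existential quantifier.
        t1 = ("var", f"t{next(_names)}")
        return ("exists", t1[1], ("pair", Cl(t[1], t1), t1))
    if tag == "pair":
        # Tuples are translated componentwise with the same environment.
        return ("pair", Cl(t[1], env), Cl(t[2], env))
    raise ValueError(f"unsupported type: {t!r}")

# The CPS type of a function that consumes a natural number: func (snat 3 -> ⊥)
cps_fn = ("func", ("to_bot", ("snat", 3)))
closure_ty = Cl(cps_fn, BOT)
```

The result has the shape $\exists t_1.\ ((t_1 \times \mathsf{snat}\ 3) \to \bot) \times t_1$: a code pointer that expects its own environment, packaged with an environment of the hidden type.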


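The definition of the closure transformation for the computation terms of $\lambda_K$ is given in Figure 8. To understand how closure conversion works, let us again consider the transformations of function abstraction and function application. The former is the heart of closure conversion and clearly the most involved case. A $\lambda_K$ term of the form $\mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e$ is transformed to a package $\langle X = A_{env},\ \langle v_{code}[Y_1] \cdots [Y_m],\ v_{env} \rangle : A_x \rangle$. The first part of this package is the type of the closure environment $A_{env}$. The second part is a pair consisting of the transformed function body $v_{code}[Y_1] \cdots [Y_m]$ and the closure environment $v_{env}$. The closure environment is a tuple containing the values of all term variables $x_0, \ldots, x_{k-1}$ that are free in $e$. On the other hand, the transformed function body takes as parameters: (i) all type variables $Y_1, \ldots, Y_m$ that are free in $e$, (ii) the type parameters $X_1, \ldots, X_n$ of the original function, and (iii) a pair $x_{arg}$ containing the closure environment $x_{env}$ and the term parameter $x$ of the original function. From the transformation of function abstractions, one immediately notices that quantification over kind schemas is required: the definition of $A'_x$ uses $\forall_{\mathsf{Kscm}}$ if $A_i : \mathsf{Kscm}$.

\[
\begin{aligned}
\mathcal{C}_{val}[\![v]\!] &= v, \quad \text{for } v \text{ one of } x,\ n,\ \mathsf{tt},\ \mathsf{ff} \\
\mathcal{C}_{val}[\![\langle v_0, \ldots, v_{n-1} \rangle]\!] &= \langle \mathcal{C}_{val}[\![v_0]\!], \ldots, \mathcal{C}_{val}[\![v_{n-1}]\!] \rangle \\
\mathcal{C}_{val}[\![\langle X = A,\ v{:}B \rangle]\!] &= \langle X = A,\ \mathcal{C}_{val}[\![v]\!] : \mathsf{Cl}\,(B)\ \bot \rangle \\
\mathcal{C}_{val}[\![\mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e]\!] &= \langle X = A_{env},\ \langle v_{code}[Y_1] \cdots [Y_m],\ v_{env} \rangle : A_x \rangle
\end{aligned}
\]

where

\[
\begin{aligned}
A_x &= A'_x \times X \\
A'_x &= \forall_{s_1} X_1{:}A_1. \ldots \forall_{s_n} X_n{:}A_n.\ (X \times \mathsf{Cl}\,(A)\ \bot) \to \bot \\
\{x_0^{A_0}, \ldots, x_{k-1}^{A_{k-1}}\} &= FV(e) - \{x, x'\} \\
\{Y_1^{B_1}, \ldots, Y_m^{B_m}\} &= FTV(\mathsf{fix}\ x'[X_1{:}A_1, \ldots, X_n{:}A_n](x{:}A).\,e) \\
A_{env} &= \mathsf{Cl}\,(\mathsf{tup}\ k\ (\mathsf{nth}\ (A_0 :: \ldots :: A_{k-1} :: \mathsf{nil})))\ \bot \\
v_{env} &= \langle x_0, \ldots, x_{k-1} \rangle \\
v_{code} &= \mathsf{fix}\ x_{fix}[Y_1{:}B_1, \ldots, Y_m{:}B_m, X_1{:}A_1, \ldots, X_n{:}A_n](x_{arg} : A_{env} \times \mathsf{Cl}\,(A)\ \bot). \\
&\qquad \mathsf{let}\ x_{env} = \mathsf{sel}[\mathsf{ltPrf}\ 0\ 2](x_{arg}, 0)\ \mathsf{in} \\
&\qquad \mathsf{let}\ x = \mathsf{sel}[\mathsf{ltPrf}\ 1\ 2](x_{arg}, 1)\ \mathsf{in} \\
&\qquad \mathsf{let}\ x' = \langle X = A_{env},\ \langle x_{fix}[Y_1] \cdots [Y_m],\ x_{env} \rangle : A_x \rangle\ \mathsf{in} \\
&\qquad \mathsf{let}\ x_0 = \mathsf{sel}[\mathsf{ltPrf}\ 0\ k](x_{env}, 0)\ \mathsf{in}\ \ldots \\
&\qquad \mathsf{let}\ x_{k-1} = \mathsf{sel}[\mathsf{ltPrf}\ (k-1)\ k](x_{env}, k-1)\ \mathsf{in} \\
&\qquad \mathcal{C}_{exp}[\![e]\!]
\end{aligned}
\]

\[
\begin{aligned}
\mathcal{C}_{exp}[\![v_1[A_1, \ldots, A_n](v_2)]\!] &= \mathsf{let}\ \langle A_{env}, x_{arg} \rangle = \mathsf{open}\ \mathcal{C}_{val}[\![v_1]\!]\ \mathsf{in} \\
&\qquad \mathsf{let}\ x_{code} = \mathsf{sel}[\mathsf{ltPrf}\ 0\ 2](x_{arg}, 0)\ \mathsf{in} \\
&\qquad \mathsf{let}\ x_{env} = \mathsf{sel}[\mathsf{ltPrf}\ 1\ 2](x_{arg}, 1)\ \mathsf{in} \\
&\qquad x_{code}[A_1] \cdots [A_n]\ \langle x_{env},\ \mathcal{C}_{val}[\![v_2]\!] \rangle \\
\mathcal{C}_{exp}[\![\mathsf{let}\ x = v\ \mathsf{in}\ e]\!] &= \mathsf{let}\ x = \mathcal{C}_{val}[\![v]\!]\ \mathsf{in}\ \mathcal{C}_{exp}[\![e]\!] \\
\mathcal{C}_{exp}[\![\mathsf{let}\ x = \mathsf{sel}[A](v, v')\ \mathsf{in}\ e]\!] &= \mathsf{let}\ x = \mathsf{sel}[A](\mathcal{C}_{val}[\![v]\!], \mathcal{C}_{val}[\![v']\!])\ \mathsf{in}\ \mathcal{C}_{exp}[\![e]\!] \\
\mathcal{C}_{exp}[\![\mathsf{let}\ \langle X, x \rangle = \mathsf{open}\ v\ \mathsf{in}\ e]\!] &= \mathsf{let}\ \langle X, x \rangle = \mathsf{open}\ \mathcal{C}_{val}[\![v]\!]\ \mathsf{in}\ \mathcal{C}_{exp}[\![e]\!] \\
\mathcal{C}_{exp}[\![\mathsf{let}\ x = v_1 + v_2\ \mathsf{in}\ e]\!] &= \mathsf{let}\ x = \mathcal{C}_{val}[\![v_1]\!] + \mathcal{C}_{val}[\![v_2]\!]\ \mathsf{in}\ \mathcal{C}_{exp}[\![e]\!] \\
\mathcal{C}_{exp}[\![\mathsf{let}\ x = v_1 < v_2\ \mathsf{in}\ e]\!] &= \mathsf{let}\ x = \mathcal{C}_{val}[\![v_1]\!] < \mathcal{C}_{val}[\![v_2]\!]\ \mathsf{in}\ \mathcal{C}_{exp}[\![e]\!] \\
\mathcal{C}_{exp}[\![\mathsf{if}[B, A](v,\ X_1.e_1,\ X_2.e_2)]\!] &= \mathsf{if}[B, A](\mathcal{C}_{val}[\![v]\!],\ X_1.\mathcal{C}_{exp}[\![e_1]\!],\ X_2.\mathcal{C}_{exp}[\![e_2]\!])
\end{aligned}
\]

Fig. 8. Closure conversion: from $\lambda_K$ to $\lambda_C$.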


Inversely, the transformation of function application opens the package and reveals the type $A_{env}$ and value $x_{env}$ of the closure environment, as well as the function's body $x_{code}$. It then applies the body to the actual parameters and to $x_{env}$.
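The abstraction and application cases can be sketched on an untyped toy language. The Python below is a simplification under stated assumptions and not the translation of Figure 8: it works on direct-style terms rather than CPS terms, ignores types, type variables, and recursion through $x'$, and uses hypothetical tags (`"pack"`, `"code"`, `"tuple"`). Each function becomes a pair of closed code and an environment tuple, and application unpacks the pair and passes the environment to the code.

```python
def free_vars(e):
    """Free term variables of a source expression."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "const":
        return set()
    if tag == "lam":
        return free_vars(e[2]) - {e[1]}
    return free_vars(e[1]) | free_vars(e[2])  # "app" and "add"

def convert(e):
    """Replace every function by a (closed code, environment) package."""
    tag = e[0]
    if tag in ("var", "const"):
        return e
    if tag == "lam":
        _, x, body = e
        fvs = sorted(free_vars(e))
        code = ("code", fvs, x, convert(body))       # refers only to fvs and x
        env = ("tuple", [("var", y) for y in fvs])   # values captured at creation
        return ("pack", code, env)
    return (tag, convert(e[1]), convert(e[2]))

def ev(e, scope):
    tag = e[0]
    if tag == "var":
        return scope[e[1]]
    if tag == "const":
        return e[1]
    if tag == "tuple":
        return tuple(ev(v, scope) for v in e[1])
    if tag == "pack":
        return (e[1], ev(e[2], scope))   # code is closed: nothing else is captured
    if tag == "add":
        return ev(e[1], scope) + ev(e[2], scope)
    if tag == "app":
        (code, env), arg = ev(e[1], scope), ev(e[2], scope)
        _, fvs, x, body = code
        inner = dict(zip(fvs, env))      # let x_i = sel(env, i), for each i
        inner[x] = arg
        return ev(body, inner)           # the caller's scope is not visible
    raise ValueError(tag)

make_adder = ("lam", "a", ("lam", "b", ("add", ("var", "a"), ("var", "b"))))
program = ("app", ("app", make_adder, ("const", 40)), ("const", 2))
result = ev(convert(program), {})
```

Because `ev` runs each code body in a scope built only from the environment tuple and the argument, the example would fail with a `KeyError` if the converted code still had a free variable.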

The following proposition states that our closure conversion preserves typing. As in the case of CPS conversion, the fact that $\mathsf{Cl}$ has been encoded as a function in TL is important for its proof.

Proposition 6.1 (Type Correctness of Closure Conversion). If $\cdot;\cdot \vdash_K v : A$, then $\cdot;\cdot \vdash_C \mathcal{C}_{val}[\![v]\!] : \mathsf{Cl}\,(A)\ \bot$.

Proof sketch. By induction on the typing derivation for $v$. □

7. RELATED WORK

Our type language is a variant of the calculus of constructions [Coquand and Huet 1988] extended with inductive definitions (with both small and large elimination) [Paulin-Mohring 1993; Werner 1994]. We omitted parameterized inductive kinds and dependent large elimination to simplify our presentation; however, all our meta-theoretic proofs carry over to a language that includes them. We support $\eta$-reduction in our language while the official Coq system does not. The proofs for the properties of TL are adapted from Geuvers [1993] and Werner [1994] (which in turn borrows ideas from Altenkirch [1993]); the main difference is that our language has kind-schema variables and a new product formation rule (Ext, Kind), which are not in Werner's system.

The Coq proof assistant provides support for extracting programs from proofs [Paulin-Mohring 1993]. It separates propositions and sets into two distinct universes Prop and Set. We do not distinguish between them because we are not aiming to extract programs from our proofs; instead, we are using proofs as specifications for our computation terms.

Burstall and McKinna [1991] proposed the notion of deliverables, which is essentially the same as our notion of certified binaries. They use dependent strong sums to model each deliverable and give its categorical semantics. Their work does not support programs with effects and has all the problems mentioned in Section 2.3.

Xi and Pfenning's DML [Xi and Pfenning 1999] is the first language that nicely combines dependent types with programs that may involve effects. Our ideas of using singleton types and lifting the level of the proof language are directly inspired by their work. DML does not support explicit proofs in its type language; any assertions (or constraints) must be resolved fully automatically in order to ensure decidable typechecking. As a result, DML's assertion language only allows integer linear inequalities. Our system, on the other hand, allows arbitrary propositions and proofs. An assertion in our system can use any integer constraints, but a certified program must explicitly provide proofs of how these constraints are satisfied. Our system is best suited for use in compiler typed intermediate languages while the DML type system is more suitable for use in a source programming language. Another difference is that DML does not define the $\Omega$ kind as an inductive definition, so it does not support intensional type analysis [Trifonov et al. 2000] and it is unclear how it can preserve proofs during compilation.

We have discussed the relationship between our work and work on PCC, typed assembly languages, and intensional type analysis in Section 1. Inductive definitions subsume and generalize earlier systems for intensional type analysis [Harper and Morrisett 1995; Crary and Weirich 1999; Trifonov et al. 2000]; the type-analysis construct in the computation language can be eliminated using the technique proposed by Crary et al. [1998].

The work presented in this paper showed one way of having types and proofs coexist in an intermediate language for certified binaries, that is, by embedding predicates and proofs directly into types. Another possibility, which we did not address, is to embed types into the logic in which proofs are carried out, essentially using pre- and post-conditions as in Hoare logic to express type invariants. Unfortunately, Hoare logic does not work well with higher-order functions; for example, it is unclear how to describe an assertion that a formal parameter (of another function) has a function type (as simple as int→int). Foundational PCC [Appel and Felty 2000] requires explicit construction of the fixed point (using an index-based semantic model) to support higher-order functions, which is probably too complex for compiler intermediate languages.

Concurrently with our work, Crary and Vanderwaart [2001] proposed a system called LTT, which also aims at adding explicit proofs to typed intermediate languages. LTT uses Linear LF [Cervesato and Pfenning 1996] as its proof language. It shares some similarities with our system in that both use singleton types [Xi and Pfenning 1999] to circumvent the problems of dependent types. However, since LF does not have inductive definitions and the Elim construct, it is unclear how LTT can support intensional type analysis and type-level primitive recursive functions [Crary and Weirich 2000]. In fact, to define $\Omega$ as an inductive kind [Trifonov et al. 2000], LTT would have to add proof-kind variables and proof-kind polymorphism, which could significantly complicate the meta-theory of its proof language. LTT requires different type languages for different intermediate languages; it is unclear whether it can preserve proofs during CPS and closure conversion. The power of linear reasoning in LTT is desirable for tracking ephemeral properties that hold only for certain program states; we are working on adding such support into our framework.

8.  CONCLUSIONS 

We presented a general framework for explicitly representing propositions and proofs in typed intermediate or assembly languages. We showed how to integrate an entire proof system into our type language and how to perform CPS and closure conversion while still preserving proofs represented in the type system. Our work is a first step toward the goal of building realistic infrastructure for certified programming and certifying compilation.

Our type system is fairly concise and simple with respect to the number of syntactic constructs, yet it is powerful enough to express all the propositions and proofs in the higher-order predicate logic (extended with induction principles). In the future, we would like to use our type system to express advanced program invariants such as those involved in low-level mutable recursive data structures.

Our type language is not designed around any particular programming language. We can use it to typecheck as many different computation languages as we like; all we need is to define the corresponding $\Omega$ kind as an inductive definition. We hope to evolve our framework into a realistic typed common intermediate format.

APPENDIX

In this appendix we supply the rest of the details involved in the formalization of our type language TL.

A. FORMALIZATION OF TL

Most  of  our  notation  and  definitions  are  directly  borrowed  from  Werner  [1994]. 
In  addition  to  the  symbols  defined  in  the  syntax,  we  will  also  use  C  to  denote 
general  terms,  Y  and  Z  for  variables,  and  I  for  inductive  definitions. 

To ensure that the interpretation of inductive definitions remains consistent and they can be interpreted as terms closed under their introduction rules, we impose positivity constraints on the constructors of an inductive definition. The positivity constraints are defined in Definitions A.1 and A.2.

Definition A.1. A term $A$ is strictly positive in $X$ if $A$ is either $X$ or $\Pi Y{:}B.\,A'$, where $A'$ is strictly positive in $X$, $X$ does not occur free in $B$, and $X \neq Y$.

Definition A.2. A term $C$ is a well-formed constructor kind for $X$ (written $\mathit{wfc}_X(C)$) if it has one of the following forms:

(1) $X$;

(2) $\Pi Y{:}B.\,C'$, where $X$ is not free in $B$, and $C'$ is a well-formed constructor kind for $X$; or

(3) $B' \to C'$, where $B'$ is strictly positive in $X$ and $C'$ is a well-formed constructor kind for $X$.

Note that in the definition of $\mathit{wfc}_X(C)$ the second clause covers the case when $C$ is of the form $B \to C'$ and $X$ does not occur free in $B$. Therefore, we only allow the occurrence of $X$ in the non-dependent case.
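Definitions A.1 and A.2 can be read as two small recursive checks. The Python sketch below is an illustration on a hypothetical tuple encoding (`("var", X)`, `("const", c)`, `("pi", Y, B, C)` for $\Pi Y{:}B.\,C$, and `("arrow", B, C)` for the non-dependent product); it is not part of the formal development and treats all other term forms as out of scope.

```python
def free_in(x, t):
    """Does variable x occur free in term t?"""
    tag = t[0]
    if tag == "var":
        return t[1] == x
    if tag == "const":
        return False
    if tag == "pi":
        _, y, b, c = t
        return free_in(x, b) or (y != x and free_in(x, c))
    if tag == "arrow":
        return free_in(x, t[1]) or free_in(x, t[2])
    raise ValueError(tag)

def strictly_positive(x, a):
    """Definition A.1: a is x, or a product ending in x with x absent from domains."""
    if a == ("var", x):
        return True
    if a[0] == "pi":
        _, y, b, rest = a
        return y != x and not free_in(x, b) and strictly_positive(x, rest)
    if a[0] == "arrow":  # non-dependent product
        return not free_in(x, a[1]) and strictly_positive(x, a[2])
    return False

def wfc(x, c):
    """Definition A.2: well-formed constructor kind for x."""
    if c == ("var", x):                       # clause (1)
        return True
    if c[0] == "pi":                          # clause (2)
        _, y, b, rest = c
        return y != x and not free_in(x, b) and wfc(x, rest)
    if c[0] == "arrow":
        _, b, rest = c
        # clause (3) for a recursive argument, or the non-dependent
        # instance of clause (2) when x does not occur in b
        return (strictly_positive(x, b) or not free_in(x, b)) and wfc(x, rest)
    return False

X = ("var", "X")
NAT = ("const", "Nat")
zero_kind = X                                   # X
succ_kind = ("arrow", X, X)                     # X -> X
limit_kind = ("arrow", ("arrow", NAT, X), X)    # (Nat -> X) -> X
bad_kind = ("arrow", ("arrow", X, X), X)        # (X -> X) -> X, negative occurrence
```

With this encoding the first three kinds are accepted by `wfc("X", ...)` and `bad_kind` is rejected, since `X` occurs to the left of an arrow inside a constructor argument.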

In the rest of this paper we often write well-formed constructor kinds for $X$ as $\Pi \vec{Y}{:}\vec{B}.\,X$. We also denote terms that are strictly positive in $X$ by $\Pi \vec{Y}{:}\vec{B}.\,X$, where $X$ is not free in $\vec{B}$.

Definition A.3. Let $C$ be a well-formed constructor kind for $X$. Then $C$ is of the form $\Pi \vec{Y}{:}\vec{B}.\,X$. If all the $Y$'s are $t$'s, that is, $C$ is of the form $\Pi \vec{t}{:}\vec{B}.\,X$, then we say that $C$ is a small constructor kind (or just a small constructor when there is no ambiguity) and denote it as $\mathit{small}(C)$.

Our inductive definitions reside in Kind, whereas a small constructor does not make universal quantification over objects of type Kind. Therefore, an inductive definition with small constructors is a predicative definition. While dealing with impredicative inductive definitions, we must forbid projections on universes equal to or bigger than the one inhabited by the definition. In particular, we restrict large elimination to inductive definitions with only small constructors.

Next, we define the set of reductions on our terms. The definition of $\beta$- and $\eta$-reduction is standard. The $\iota$-reduction defines primitive recursion over inductive objects.

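Definition A.4. Let $C$ be a well-formed constructor kind for $X$ and let $A$, $B'$, and $I$ be terms. We define $\Phi_{X,I,B'}(C, A)$ inductively on the structure of $C$:

\[
\begin{aligned}
\Phi_{X,I,B'}(X, A) &\stackrel{\mathrm{def}}{=} A \\
\Phi_{X,I,B'}(\Pi Y{:}B.\,C', A) &\stackrel{\mathrm{def}}{=} \lambda Y{:}B.\ \Phi_{X,I,B'}(C', A\ Y) \\
\Phi_{X,I,B'}((\Pi \vec{Y}{:}\vec{B}.\,X) \to C', A) &\stackrel{\mathrm{def}}{=} \lambda Z{:}(\Pi \vec{Y}{:}\vec{B}.\,I).\ \Phi_{X,I,B'}(C', A\ Z\ (\lambda \vec{Y}{:}\vec{B}.\ B'\ (Z\ \vec{Y})))
\end{aligned}
\]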

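Definition A.5. The reduction relations on our terms are defined as:

\[
\begin{aligned}
(\lambda X{:}A.\,B)\ A' &\leadsto_{\beta} [A'/X]B \\
\lambda X{:}A.\,(B\ X) &\leadsto_{\eta} B, \quad \text{if } X \notin FV(B) \\
\mathsf{Elim}[I, A''](\mathsf{Ctor}\,(i, I)\ \vec{A})\{\vec{B}\} &\leadsto_{\iota} (\Phi_{X,I,B'}(C_i, B_i))\ \vec{A} \\
\text{where } I &= \mathsf{Ind}(X{:}\mathsf{Kind})\{\vec{C}\} \\
B' &= \lambda Y{:}I.\ (\mathsf{Elim}[I, A''](Y)\{\vec{B}\})
\end{aligned}
\]

Recall that in Section 3.2 we introduced the relations $\rhd_{\beta}$, $\rhd_{\eta}$, and $\rhd_{\iota}$ as the contextual closures of the relations $\leadsto_{\beta}$, $\leadsto_{\eta}$, and $\leadsto_{\iota}$ respectively; we write $\leadsto$ and $\rhd$ for the unions of the above relations, and $=_{\beta\eta\iota}$ for the reflexive, symmetric, and transitive closure of $\rhd$.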

Let us examine the $\iota$-reduction in detail. In $\mathsf{Elim}[I, A''](A)\{\vec{B}\}$, the term $A$ of type $I$ is being analyzed. The sequence $\vec{B}$ contains the set of branches of $\mathsf{Elim}$, one for each constructor of $I$. In the case when $C_i = X$, which implies that $A$ is of the form $\mathsf{Ctor}(i, I)$, the $\mathsf{Elim}$ just selects the $B_i$ branch:

$$\mathsf{Elim}[I, A''](\mathsf{Ctor}(i, I))\{\vec{B}\} \leadsto_\iota B_i$$

In the case when $C_i = \Pi\vec{Y}:\vec{B}.\,X$, where $X$ does not occur free in $\vec{B}$, $A$ must be of the form $\mathsf{Ctor}(i, I)\,\vec{A}$, with $A_i$ of type $B_i$. The $\mathsf{Elim}$ selects the $B_i$ branch and passes the constructor arguments to it. Accordingly, the reduction yields (by application of the meta-level function $\Phi$):

$$\mathsf{Elim}[I, A''](\mathsf{Ctor}(i, I)\,\vec{A})\{\vec{B}\} \leadsto_\iota B_i\,\vec{A}$$

The recursive case is the most interesting. For simplicity assume that the $i$th constructor has the form $(\Pi\vec{Y}:\vec{B}'.\,X) \to \Pi\vec{Y}':\vec{B}''.\,X$. Therefore, $A$ is of the form $\mathsf{Ctor}(i, I)\,\vec{A}$ with $A_1$ being the recursive component of type $\Pi\vec{Y}:\vec{B}'.\,I$, and $A_2 \ldots A_n$ being non-recursive. The reduction rule then yields:

$$\mathsf{Elim}[I, A''](\mathsf{Ctor}(i, I)\,\vec{A})\{\vec{B}\} \leadsto_\iota B_i\,A_1\,(\lambda\vec{Y}:\vec{B}'.\,\mathsf{Elim}[I, A''](A_1\,\vec{Y})\{\vec{B}\})\,A_2 \ldots A_n$$

The $\mathsf{Elim}$ construct selects the $B_i$ branch and passes the arguments $A_1, \ldots, A_n$, and the result of recursively processing $A_1$. In the general case, it would process each recursive argument.

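For example, suppose the kind $\mathsf{Nat}$ of natural numbers is defined as

$$\mathsf{Ind}(\mathsf{Nat}:\mathsf{Kind})\{\mathsf{Nat};\ \mathsf{Nat} \to \mathsf{Nat}\},$$

with the constructor zero defined as $\mathsf{Ctor}(1, \mathsf{Nat})$ and the constructor succ defined as $\mathsf{Ctor}(2, \mathsf{Nat})$. Consider $\mathsf{Elim}[\mathsf{Nat}, A''](A)\{B_0; B_s\}$, where $B_0$ and $B_s$ are the branches for the zero and succ constructors. Then we have:

$$
\begin{aligned}
\mathsf{Elim}[\mathsf{Nat}, A''](\mathsf{Ctor}(1, \mathsf{Nat}))\{B_0; B_s\} &\leadsto_\iota B_0 \\
\mathsf{Elim}[\mathsf{Nat}, A''](\mathsf{Ctor}(2, \mathsf{Nat})\,N)\{B_0; B_s\} &\leadsto_\iota B_s\,N\,(\mathsf{Elim}[\mathsf{Nat}, A''](N)\{B_0; B_s\})
\end{aligned}
$$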
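The two reductions for Nat amount to ordinary primitive recursion. As a minimal sketch, assuming an untyped encoding of our own (tagged tuples for zero and succ, and a curried Python function for the succ branch; none of these names come from TL), the $\iota$-rule can be executed directly:

```python
ZERO = ("zero",)


def succ(n):
    """Build the successor of a numeral (plays the role of Ctor(2, Nat))."""
    return ("succ", n)


def elim_nat(n, b0, bs):
    """Mimic the iota-reduction of Elim[Nat, A''](n){b0; bs}.

    zero   reduces to  b0
    succ N reduces to  bs N (Elim ... (N){b0; bs})
    """
    if n[0] == "zero":
        return b0
    pred = n[1]
    # The branch receives the constructor argument and the recursive result.
    return bs(pred)(elim_nat(pred, b0, bs))


def add(m, n):
    """Addition by recursion on m: zero gives n, succ wraps the result."""
    return elim_nat(m, n, lambda _pred: lambda rec: succ(rec))


def to_int(n):
    """Convert a numeral to a Python int, again through elim_nat."""
    return elim_nat(n, 0, lambda _pred: lambda rec: rec + 1)


two = succ(succ(ZERO))
one = succ(ZERO)
print(to_int(add(two, one)))  # prints: 3
```

The sketch ignores the type annotation $A''$ entirely; it only shows which branch is selected and which arguments it receives.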
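$$\cdot \vdash \mathsf{Kind} : \mathsf{Kscm} \qquad \text{(AX1)}$$

$$\cdot \vdash \mathsf{Kscm} : \mathsf{Ext} \qquad \text{(AX2)}$$

$$\frac{\Delta \vdash C : \mathsf{Kind} \quad \Delta \vdash A : B \quad t \notin Dom(\Delta)}{\Delta, t:C \vdash A : B} \qquad \text{(WEAK1)}$$

$$\frac{\Delta \vdash C : \mathsf{Kscm} \quad \Delta \vdash A : B \quad k \notin Dom(\Delta)}{\Delta, k:C \vdash A : B} \qquad \text{(WEAK2)}$$

$$\frac{\Delta \vdash C : \mathsf{Ext} \quad \Delta \vdash A : B \quad z \notin Dom(\Delta)}{\Delta, z:C \vdash A : B} \qquad \text{(WEAK3)}$$

$$\frac{\Delta \vdash \mathsf{Kind} : \mathsf{Kscm} \quad X \in Dom(\Delta)}{\Delta \vdash X : \Delta(X)} \qquad \text{(VAR)}$$

$$\frac{\Delta, X:A \vdash B : B' \quad \Delta \vdash \Pi X:A.\,B' : s}{\Delta \vdash \lambda X:A.\,B : \Pi X:A.\,B'} \qquad \text{(FUN)}$$

$$\frac{\Delta \vdash A : \Pi X:B'.\,A' \quad \Delta \vdash B : B'}{\Delta \vdash A\,B : [B/X]A'} \qquad \text{(APP)}$$

$$\frac{\Delta \vdash A : s_1 \quad \Delta, X:A \vdash B : s_2 \quad (s_1, s_2) \in \mathcal{R}}{\Delta \vdash \Pi X:A.\,B : s_2} \qquad \text{(PROD)}$$

$$\frac{\text{for all } i \quad \Delta, X:\mathsf{Kind} \vdash C_i : \mathsf{Kind} \quad wfc_X(C_i)}{\Delta \vdash \mathsf{Ind}(X:\mathsf{Kind})\{\vec{C}\} : \mathsf{Kind}} \qquad \text{(IND)}$$

$$\frac{\Delta \vdash I : \mathsf{Kind}}{\Delta \vdash \mathsf{Ctor}(i, I) : [I/X]C_i} \quad \text{where } I = \mathsf{Ind}(X:\mathsf{Kind})\{\vec{C}\} \qquad \text{(CON)}$$

$$\frac{\begin{array}{c}\Delta \vdash A : I \quad \Delta \vdash A' : I \to \mathsf{Kind} \\ \text{for all } i \quad \Delta \vdash B_i : \zeta_{X,I}(C_i, A', \mathsf{Ctor}(i, I))\end{array}}{\Delta \vdash \mathsf{Elim}[I, A'](A)\{\vec{B}\} : A'\,A} \quad \text{where } I = \mathsf{Ind}(X:\mathsf{Kind})\{\vec{C}\} \qquad \text{(ELIM)}$$

$$\frac{\begin{array}{c}\Delta \vdash A : I \quad \Delta \vdash A' : \mathsf{Kscm} \\ \text{for all } i \quad small(C_i) \quad \Delta \vdash B_i : \Psi_{X,I}(C_i, A')\end{array}}{\Delta \vdash \mathsf{Elim}[I, A'](A)\{\vec{B}\} : A'} \quad \begin{array}{l}\text{where } I = \mathsf{Ind}(X:\mathsf{Kind})\{\vec{C}\} \\ \text{A binds no kind-schema variables}\end{array} \qquad \text{(L-ELIM)}$$

$$\frac{\Delta \vdash A : B \quad \Delta \vdash B' : s \quad \Delta \vdash B : s \quad B =_{\beta\eta\iota} B'}{\Delta \vdash A : B'} \qquad \text{(CONV)}$$

Fig. 9. Formation rules of TL.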
The following two definitions introduce the meta-level functions $\zeta$ and $\Psi$, which compute the types of the branches of the small and large elimination constructs, respectively. The cases follow from the $\iota$-reduction rule in Definition A.5.

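Definition A.6 Let $C$ be a well-formed constructor kind for $X$ and let $A$, $B'$, and $I$ be terms. We define $\zeta_{X,I}(C, A, B')$ inductively on the structure of $C$:

$$
\begin{aligned}
\zeta_{X,I}(X, A, B') &\stackrel{\mathrm{def}}{=} A\,B' \\
\zeta_{X,I}(\Pi Y:B.\,C', A, B') &\stackrel{\mathrm{def}}{=} \Pi Y:B.\,\zeta_{X,I}(C', A, B'\,Y) \\
\zeta_{X,I}((\Pi\vec{Y}:\vec{B}.\,X) \to C', A, B') &\stackrel{\mathrm{def}}{=} \Pi Z:(\Pi\vec{Y}:\vec{B}.\,I).\,(\Pi\vec{Y}:\vec{B}.\,(A\,(Z\,\vec{Y}))) \to \zeta_{X,I}(C', A, B'\,Z)
\end{aligned}
$$

where $X$ is not free in $B$ and $\vec{B}$.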

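Definition A.7 Let $C$ be a well-formed constructor kind for $X$ and let $A$ and $I$ be two terms. We define $\Psi_{X,I}(C, A)$ inductively on the structure of $C$:

$$
\begin{aligned}
\Psi_{X,I}(X, A) &\stackrel{\mathrm{def}}{=} A \\
\Psi_{X,I}(\Pi Y:B.\,C', A) &\stackrel{\mathrm{def}}{=} \Pi Y:B.\,\Psi_{X,I}(C', A) \\
\Psi_{X,I}(B' \to C', A) &\stackrel{\mathrm{def}}{=} [I/X]B' \to [A/X]B' \to \Psi_{X,I}(C', A)
\end{aligned}
$$

where $X$ is not free in $B$ and $B'$ is strictly positive in $X$.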
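As a worked instance (our own illustration, not part of the original definitions), consider the kind $\mathsf{Nat}$ of natural numbers with constructor kinds $C_1 = X$ and $C_2 = X \to X$, writing zero for $\mathsf{Ctor}(1, \mathsf{Nat})$ and succ for $\mathsf{Ctor}(2, \mathsf{Nat})$. Unfolding Definition A.6 with $A'$ as in the (ELIM) rule, and reading $C_2$ as the recursive case with an empty sequence $\vec{Y}$, should give the branch types for small elimination:

$$
\begin{aligned}
\zeta_{X,\mathsf{Nat}}(X, A', \mathsf{zero}) &= A'\,\mathsf{zero} \\
\zeta_{X,\mathsf{Nat}}(X \to X, A', \mathsf{succ}) &= \Pi Z:\mathsf{Nat}.\,(A'\,Z) \to \zeta_{X,\mathsf{Nat}}(X, A', \mathsf{succ}\,Z) \\
&= \Pi Z:\mathsf{Nat}.\,(A'\,Z) \to A'\,(\mathsf{succ}\,Z)
\end{aligned}
$$

Unfolding Definition A.7 with $A'$ as in the (L-ELIM) rule, and taking $B' = X$ in the third case, should give the branch types for large elimination:

$$
\begin{aligned}
\Psi_{X,\mathsf{Nat}}(X, A') &= A' \\
\Psi_{X,\mathsf{Nat}}(X \to X, A') &= [\mathsf{Nat}/X]X \to [A'/X]X \to \Psi_{X,\mathsf{Nat}}(X, A') \\
&= \mathsf{Nat} \to A' \to A'
\end{aligned}
$$

The small-elimination branch types have the familiar shape of the induction principle for natural numbers, while the large-elimination ones describe a non-dependent recursor.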

The complete typing rules for TL are listed in Figure 9. The three weakening rules make sure that all variables are bound to the correct classes of terms in the context. There are no separate context-formation rules; a context $\Delta$ is well-formed if we can derive the judgment $\Delta \vdash \mathsf{Kind} : \mathsf{Kscm}$ (notice we can only add new variables to the context via the weakening rules).